Bug #1933: url parser issues - AdiIRC - AdiIRC Support/Bugs/Feature Requests

Bug #1933

closed

url parser issues

Added by Dmytro Borys over 10 years ago. Updated over 9 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Per Amundsen

Category:

Interface

Target version:

2.3

Start date:

02/25/2015

Due date:

% Done:

Estimated time:

Operative System:

All

Regression:

Description

Certain IRC in-channel search services often return a URL wrapped in some kind of brackets as decoration on both sides. Most of the time, AdiIRC treats the trailing character as part of the URL, so clicking the link tries to load an address that includes the extra symbol(s), which usually ends in a 404 or 403 error.

I'm attaching a screenshot that shows this behavior. When a URL is surrounded with [], the closing bracket is treated as part of the URL in 2 of 3 cases. Also, people who care about punctuation sometimes post something like "Hey guys, check this link: http://blah.blah. It's my new website!". The dot at the end of the URL is intended as a sentence terminator, not as part of the URL, but the parser includes it anyway. In my opinion, a dot matching a regex pattern such as "\.\s" or "\.$" should be excluded from the URL string by default, since a sentence-ending dot mistakenly treated as part of a URL is far more common than an actual URL that ends in a dot.
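The trailing-dot suggestion above can be sketched roughly as follows. This is only an illustration in Python, not AdiIRC's code: the `URL_CANDIDATE` pattern and the `find_urls` helper are hypothetical names, and the assumption is that a URL candidate is simply a run of non-whitespace characters starting with http:// or https://.

```python
import re

# Assumption: a URL candidate is "http://" or "https://" followed by
# any run of non-whitespace characters. Real parsers are stricter.
URL_CANDIDATE = re.compile(r"https?://\S+")


def find_urls(text):
    """Return URL candidates with a sentence-ending dot removed."""
    urls = []
    for match in URL_CANDIDATE.finditer(text):
        url = match.group(0)
        # \S+ stops at whitespace or at the end of the text, so a final
        # "." here corresponds to the "\.\s" and "\.$" cases.
        if url.endswith("."):
            url = url[:-1]
        urls.append(url)
    return urls
```

Dots inside the URL are left alone; only a dot that is immediately followed by whitespace or the end of the message is dropped.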

This was observed on the latest version of the client for 64-bit Windows.

Files

ss+(2015-02-25+at+12.44.26).png (29 KB)

Dmytro Borys, 02/25/2015 12:44 AM

Updated by Per Amundsen over 10 years ago

Category set to Interface
Status changed from New to Assigned
Assignee set to Per Amundsen
Target version set to 1.9.6

I have this on my TODO list already, but it's a little complicated since "])." are valid URL characters after a slash (/). I will get around to it.

Updated by Jonathan Kay over 9 years ago

Just to add on top of these examples already listed, I frequently see links (usually from bots) which include the greater-than sign, such as:

<SomeBot> Results: Google Nothing <http://www.google.com/search>

This results in the same problem: the link includes the trailing > and, as expected, navigating to it gives a 404.

Updated by Per Amundsen over 9 years ago

Status changed from Assigned to Resolved
Target version changed from 1.9.6 to 2.3

I wrote a new parser, which will be in the next beta. It will remove a trailing ")", ">", "]", or "}" if there is a leading one, as well as a trailing ".".
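A minimal sketch of that rule, written in Python for illustration only. It is not the actual AdiIRC parser: the names `URL_CANDIDATE`, `PAIRS`, and `extract_urls` are hypothetical, and it assumes that "a leading one" means the matching opening character sits immediately before the URL.

```python
import re

# Assumption: a URL candidate is a run of non-whitespace characters
# starting with http:// or https://.
URL_CANDIDATE = re.compile(r"https?://\S+")

# Opening character -> closing character that may be stripped.
PAIRS = {"(": ")", "<": ">", "[": "]", "{": "}"}


def extract_urls(text):
    """Return URLs with decorative trailing characters removed."""
    urls = []
    for match in URL_CANDIDATE.finditer(text):
        url = match.group(0)
        # Drop a trailing sentence dot first, so "[url]." is handled too.
        if url.endswith("."):
            url = url[:-1]
        # Look at the character directly before the URL, if any.
        start = match.start()
        before = text[start - 1] if start > 0 else ""
        closer = PAIRS.get(before)
        # Strip the closer only when its opener precedes the URL.
        if closer and url.endswith(closer):
            url = url[:-1]
        urls.append(url)
    return urls
```

Because the closing character is removed only when its opener precedes the URL, an address such as http://en.wikipedia.org/wiki/Foo_(bar) keeps its final parenthesis, while <http://www.google.com/search> loses the decorative >.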

Updated by Per Amundsen over 9 years ago

Status changed from Resolved to Closed
