Imgur URL parser


I'm fairly new to Python and have been doing a few Edabit challenges to improve my problem-solving. I have just completed a somewhat difficult challenge, and I was hoping for some feedback.

The challenge itself:

Create a function that takes an imgur link (as a string) and extracts
the unique id and type. Return an object containing the unique id, and
a string indicating what type of link it is.

The link could be pointing to:

An album (e.g. http://imgur.com/a/cjh4E)

A gallery (e.g. http://imgur.com/gallery/59npG)

An image (e.g. http://imgur.com/OzZUNMM)

An image (direct link) (e.g. http://i.imgur.com/altd8Ld.png)

Examples

"http://imgur.com/a/cjh4E" ➞ { id: "cjh4E", type: "album" }

"http://imgur.com/gallery/59npG" ➞ { id: "59npG", type: "gallery" }

"http://i.imgur.com/altd8Ld.png" ➞ { id: "altd8Ld", type: "image" }

I came up with the following.

    import re

    def imgurUrlParser(url):

        url_regex = "^[http://www.|https://www.|http://|https://|www.]*[imgur|i.imgur]*.com"
        url = re.match(url_regex, url).string

        gallery_regex = re.match(url_regex + "(/gallery/)(\w+)", url)
        album_regex = re.match(url_regex + "(/a/)(\w+)", url)
        image_regex = re.match(url_regex + "/(\w+)", url)
        direct_link_regex = re.match(url_regex + "(\w+)(\.\w+)", url)

        if gallery_regex:
            return { "id" : gallery_regex.group(2), "type" : "gallery" }
        elif album_regex:
            return { "id" : album_regex.group(2), "type" : "album" }
        elif image_regex:
            return { "id" : image_regex.group(1), "type" : "image" }
        elif direct_link_regex:
            return { "id" : direct_link_regex.group(1), "type" : "image" }

asked 6 hours ago by snorkle (new contributor); edited 5 hours ago by 200_success

This code doesn't run. direct_link_regex is a string and strings don't have a group() method. Also, the parentheses for that regex are not balanced properly. Perhaps you copied an old version of your code?
– JakeD, 6 hours ago

Edited. Thanks, Jake.
– snorkle, 6 hours ago


python python-3.x programming-challenge regex url


2 Answers

By RFC 1738, the scheme and host portions of URLs are case-insensitive. Also, it is allowable to include a redundant port number in the URL.
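As a rough illustration of these two points (this sketch is not part of the answer's own code), the standard library's urllib.parse.urlsplit already lowercases the scheme and host and exposes the port separately. The sample URL is just a mixed-case variant of the gallery link from the challenge.

```python
from urllib.parse import urlsplit

# Mixed-case scheme and host, plus an explicit (redundant) default port.
parts = urlsplit("HtTP://imgur.COM:80/gallery/59npG")

print(parts.scheme)    # scheme, lowercased by urlsplit
print(parts.hostname)  # host name, lowercased and without the port
print(parts.port)      # port as an int, or None when the URL has none
print(parts.path)      # path, left in its original case
```

A regex-only solution has to handle the same variations by hand, which is what a case-insensitive host match and an optional port group are for.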

Imgur also partners with certain other websites. For instance, when you upload an image through the question editor on a Stack Exchange site, it will end up on https://i.stack.imgur.com.

There is a lot of commonality in the various regexes. Consider combining them all into a single regex. Use named capture groups to avoid the magic group numbers.

A docstring with doctests would be very beneficial for this function.

    import re

    def parse_imgur_url(url):
        """
        Extract the type and id from an Imgur URL.

        >>> parse_imgur_url('http://imgur.com/a/cjh4E')
        {'id': 'cjh4E', 'type': 'album'}
        >>> parse_imgur_url('HtTP://imgur.COM:80/gallery/59npG')
        {'id': '59npG', 'type': 'gallery'}
        >>> parse_imgur_url('https://i.imgur.com/altd8Ld.png')
        {'id': 'altd8Ld', 'type': 'image'}
        >>> parse_imgur_url('https://i.stack.imgur.com/ELmEk.png')
        {'id': 'ELmEk', 'type': 'image'}
        >>> parse_imgur_url('http://not-imgur.com/altd8Ld.png') is None
        Traceback (most recent call last):
          ...
        ValueError: "http://not-imgur.com/altd8Ld.png" is not a valid imgur URL
        >>> parse_imgur_url('tftp://imgur.com/gallery/59npG') is None
        Traceback (most recent call last):
          ...
        ValueError: "tftp://imgur.com/gallery/59npG" is not a valid imgur URL
        >>> parse_imgur_url('Blah') is None
        Traceback (most recent call last):
          ...
        ValueError: "Blah" is not a valid imgur URL
        """
        match = re.match(
            r'^(?i:https?://(?:.+\.)?imgur\.com)(:\d+)?'
            r'/(?:(?P<album>a/)|(?P<gallery>gallery/))?(?P<id>\w+)',
            url
        )
        if not match:
            raise ValueError('"{}" is not a valid imgur URL'.format(url))
        return {
            'id': match.group('id'),
            'type': 'album' if match.group('album') else
                    'gallery' if match.group('gallery') else
                    'image',
        }

Note that the regex above relies on the (?aiLmsux-imsx:...) feature of Python 3.6, and the doctests rely on the predictable order of dictionary keys in Python 3.6 / 3.7.
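To make the scoped-flag remark concrete, here is a small sketch (the patterns and sample strings are made up for illustration and are simpler than the regex above). The flag applies only inside the (?i:...) group, so anything outside it still matches case-sensitively.

```python
import re

# (?i:...) makes only the enclosed part case-insensitive (Python 3.6+).
pattern = re.compile(r"(?i:imgur\.com)/(?P<id>\w+)")

match = pattern.match("IMGUR.Com/AbC12")
print(match.group("id"))  # the captured id keeps its original case

# Outside the scoped group, matching stays case-sensitive.
strict = re.compile(r"(?i:imgur)\.com/")
print(strict.match("IMGUR.com/") is not None)  # host part may vary in case
print(strict.match("imgur.COM/") is not None)  # ".com" must be lowercase here
```

This is why only the scheme and host sit inside the (?i:...) group: the id is case-sensitive and is returned unchanged.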

answered 3 hours ago by 200_success (edited 3 hours ago)

For a first pass, not bad! Your code is pretty easy to follow.

Problems:

Don't use [] to match different strings. [] matches any set of characters, so [imgur|i.imgur]* will match the empty string, g, mgi, etc. You probably wanted a non-capturing group, which is specified with (?: ...); see the re docs.

Name functions with snake_case, as recommended by PEP 8.

The challenge as stated doesn't specify what should happen if the string passed in doesn't match the link format. Right now your code will throw an AttributeError, which isn't very helpful to the caller. I'd recommend raising an explicit exception with a more helpful message.

Your last case, direct_link_regex, is never reached with valid input, since such input is already handled by image_regex.
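A quick sketch of the first and third problems (the anchored patterns below are simplified stand-ins, not the exact regex from the question): a character class accepts any jumble of its characters, a non-capturing group matches whole alternatives, and re.match returns None on failure.

```python
import re

# A character class matches any ONE of the listed characters, so this
# pattern accepts any run of the characters i, m, g, u, r, "|" and ".".
char_class = re.compile(r"^[imgur|i.imgur]*$")
print(bool(char_class.match("mgi")))  # jumbled letters are accepted
print(bool(char_class.match("")))     # so is the empty string

# A non-capturing group with an optional prefix matches whole names only.
group = re.compile(r"^(?:i\.)?imgur$")
print(bool(group.match("imgur")))
print(bool(group.match("i.imgur")))
print(bool(group.match("mgi")))

# re.match returns None when nothing matches, so chaining an attribute
# access onto the result raises AttributeError for bad input.
try:
    re.match(r"^https?://", "Blah").string
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__
print(error_name)
```

Checking the match object before using it, and raising something like ValueError with the offending URL in the message, gives the caller far more to work with.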

Improvements:

Concatenating the regex to handle each case is somewhat messy. It would be better to have a single regex which handles all cases.

Regular expressions are usually expressed using raw strings, that is, strings with an r prefix. This helps with escaping characters correctly. In this case I'm guessing you just got lucky that it worked as you expected.

Including a docstring is always a good idea, and you can even embed tests using doctest.
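The raw-string point can be seen in a few lines. This is only an illustrative sketch with made-up sample text. An escape such as \w happens to survive in a normal string because Python leaves unknown escapes alone (newer versions warn about it), but \b is a real string escape and silently turns into a backspace character.

```python
import re

# In a raw string the backslash reaches the regex engine untouched.
print(len(r"\b"))  # two characters: a backslash and "b"

# In a normal string, "\b" is a single backspace character.
print(len("\b"))

# With a raw string, \b is the regex word-boundary assertion.
raw_hit = re.search(r"\bcat\b", "a cat sat")
print(raw_hit is not None)

# Without it, the pattern looks for literal backspace characters.
plain_hit = re.search("\bcat\b", "a cat sat")
print(plain_hit is not None)
```

Using the r prefix on every pattern removes the need to remember which escapes Python itself interprets.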

How I would implement this function:

    import re


    def imgur_url_parser(url):
        """
        Parses an imgur url into components.

        >>> imgur_url_parser("http://imgur.com/a/cjh4E") == {"type": "album", "id": "cjh4E"}
        True
        >>> imgur_url_parser("http://imgur.com/gallery/59npG") == {"type": "gallery", "id": "59npG"}
        True
        >>> imgur_url_parser("http://i.imgur.com/altd8Ld.png") == {"type": "image", "id": "altd8Ld"}
        True
        >>> imgur_url_parser("http://imgur.com/OzZUNMM") == {"type": "image", "id": "OzZUNMM"}
        True
        """
        match = re.match(r"^https?://(?:www\.|i\.)?imgur\.com/([\w.]+)/?(\w*)$", url)
        if not match:
            raise ValueError('The string "{}" is not a valid imgur link'.format(url))
        # Group 2 is empty when this is an image link
        if not match.group(2):
            # Remove the image extension, if it exists
            image_id = re.sub(r"(\.\w+)?$", "", match.group(1))
            return {"id": image_id, "type": "image"}
        # Group 1 is "a" for albums; otherwise treat the link as a gallery
        url_type = "album" if match.group(1) == "a" else "gallery"
        return {"id": match.group(2), "type": url_type}


    if __name__ == "__main__":
        import doctest
        doctest.testmod()

answered 5 hours ago by Gerrit0
