Escaping URLs
Published on 30 Jan 2005
Tags #Perl
To ensure that URLs can be transmitted and displayed regardless of the locale, special characters are replaced by a percent sign followed by their two-digit hexadecimal value.
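As a minimal sketch of that substitution for a single character (the helper name `percent_encode_char` is hypothetical and not part of the scripts in this article), the character's numeric value is formatted as two hexadecimal digits after a percent sign:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: percent-encode one single-byte character.
sub percent_encode_char {
    my ($char) = @_;
    # ord() gives the numeric value, %02X formats it as two hex digits
    return sprintf('%%%02X', ord($char));
}

print percent_encode_char(' '), "\n";   # prints %20
print percent_encode_char('#'), "\n";   # prints %23
```

A space has the value 32 (hexadecimal 20), so it becomes `%20`.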
The following two Perl scripts were created to encode and decode URLs.
Note: This procedure is defined in RFC 2396 - Uniform Resource Identifiers (URI): Generic Syntax
Escaping
#!/usr/bin/perl
use strict;
use warnings;
use English;

# URL to escape, passed as the first command-line argument
my $url = $ARGV[0];

for (my $i = 0; $i < length($url); ++$i) {
    my $char = substr($url, $i, 1);

    # substitute: treat '+' as a space
    if ($char eq '+') {
        $char = ' ';
    }

    # translate: percent-encode anything outside the safe set
    if ($char =~ m/^([^a-zA-Z0-9\-_\/.,:?&=])$/) {
        print '%' . unpack('H2', $1);
    } else {
        print $char;
    }
}
print "\n";
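The character-by-character loop can also be written as a single substitution. The following is only a sketch under the same assumptions as the script above (the same set of safe characters, and `+` treated as a space); the subroutine name `escape_url` is hypothetical:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape a URL using the same rules as the script above.
sub escape_url {
    my ($url) = @_;
    # substitute: treat '+' as a space
    $url =~ tr/+/ /;
    # translate: replace every character outside the safe set with %xx
    $url =~ s/([^a-zA-Z0-9\-_\/.,:?&=])/'%' . unpack('H2', $1)/ge;
    return $url;
}

print escape_url('http://example.com/a b?q=x y'), "\n";
# prints http://example.com/a%20b?q=x%20y
```

Wrapping the logic in a subroutine makes it reusable from other code, while the command-line script remains handy for one-off conversions.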
De-escaping
#!/usr/bin/perl
use strict;
use warnings;
use English;
my $url = $ARGV[0];
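# Decode each %XX sequence (two hex digits) back into the byte it represents;
# 'C' (unsigned char) avoids wrap-around warnings for values above 127
$url =~ s/%([0-9A-Fa-f]{2})/pack('C', hex($1))/eg;
print $url . "\n";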
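To illustrate how decoding behaves on different inputs, here is a small sketch that wraps the substitution in a subroutine (the name `unescape_url` is hypothetical). It assumes that only a percent sign followed by two hexadecimal digits counts as an escape sequence, so anything else is left untouched:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decode %XX sequences in a URL.
sub unescape_url {
    my ($url) = @_;
    # Only a '%' followed by exactly two hex digits is decoded
    $url =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $url;
}

print unescape_url('http://example.com/a%20b?q=x%2By'), "\n";
# prints http://example.com/a b?q=x+y
print unescape_url('100%zz'), "\n";
# prints 100%zz (not a valid escape sequence, so it is left unchanged)
```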