Command line tool for URL parsing and manipulation
Replace the host name of a URL:
$ trurl --url https://curl.se --set host=example.com
https://example.com/
Create a URL by setting components:
$ trurl --set host=example.com --set scheme=ftp
ftp://example.com/
Redirect a URL:
$ trurl --url https://curl.se/we/are.html --redirect here.html
https://curl.se/we/here.html
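The redirect above behaves like ordinary relative-reference resolution. As a rough cross-check outside trurl, Python's standard urllib.parse.urljoin can resolve the same inputs. This is only a comparison sketch, not a description of how trurl is implemented:

```python
from urllib.parse import urljoin

base = "https://curl.se/we/are.html"

# Resolve a relative reference against the base URL, as a redirect would.
redirected = urljoin(base, "here.html")
print(redirected)  # https://curl.se/we/here.html
```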
Change port number:
$ trurl --url https://curl.se/we/../are.html --set port=8080
https://curl.se:8080/are.html
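Note that the output above also collapsed the "/we/.." dot segments in the path. A minimal Python sketch of the same two steps follows; it assumes posixpath.normpath is a close enough stand-in for URL dot-segment removal, which holds for this input but not in general (for instance, normpath also drops trailing slashes):

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

parts = urlsplit("https://curl.se/we/../are.html")

# Collapse "we/.." so the path becomes "/are.html" (approximation, see above).
normalized_path = posixpath.normpath(parts.path)

# Rebuild the authority with an explicit port and reassemble the URL.
with_port = urlunsplit(
    parts._replace(netloc=f"{parts.hostname}:8080", path=normalized_path)
)
print(with_port)  # https://curl.se:8080/are.html
```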
Extract the path from a URL:
$ trurl --url https://curl.se/we/are.html --get '{path}'
/we/are.html
Extract the port from a URL:
$ trurl --url https://curl.se/we/are.html --get '{port}'
443
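For comparison, here is roughly how the same two components could be read with Python's urllib.parse. The DEFAULT_PORTS table is an assumption made for this sketch (urlsplit itself reports no port when the URL does not spell one out), and it lists only two schemes:

```python
from urllib.parse import urlsplit

# Assumed default-port table for this sketch; only two schemes are listed.
DEFAULT_PORTS = {"http": 80, "https": 443}

parts = urlsplit("https://curl.se/we/are.html")
path = parts.path

# urlsplit returns None for a missing port, so fall back to the scheme default.
port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)

print(path)  # /we/are.html
print(port)  # 443
```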
Append a path segment to a URL:
$ trurl --url https://curl.se/hello --append path=you
https://curl.se/hello/you
Append a query segment to a URL:
$ trurl --url "https://curl.se?name=hello" --append query=search=string
https://curl.se/?name=hello&search=string
Read URLs from stdin:
$ cat urllist.txt | trurl --url-file -
...
Output JSON:
$ trurl "https://fake.host/hello#frag" --set user=::moo:: --json
[
  {
    "url": "https://%3a%3amoo%3a%3a@fake.host/hello#frag",
    "parts": {
      "scheme": "https",
      "user": "::moo::",
      "host": "fake.host",
      "path": "/hello",
      "fragment": "frag"
    }
  }
]
Remove tracking tuples from query:
$ trurl "https://curl.se?search=hey&utm_source=tracker" --trim query="utm_*"
https://curl.se/?search=hey
Show a specific query key value:
$ trurl "https://example.com?a=home&here=now&thisthen" -g '{query:a}'
home
Sort the key/value pairs in the query component:
$ trurl "https://example.com?b=a&c=b&a=c" --sort-query
https://example.com/?a=c&b=a&c=b
Work with a query that uses a semicolon separator:
$ trurl "https://curl.se?search=fool;page=5" --trim query="search" --query-separator ";"
https://curl.se/?page=5
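The query operations above (trimming by key pattern, sorting by key, and using a custom separator) all come down to splitting the query into pairs, filtering or reordering them, and joining them again. The following Python sketch illustrates that idea. The helper names trim_query and sort_query are hypothetical, and the glob matching is done with fnmatch, which may be more permissive than trurl's own pattern matching:

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit, urlunsplit


def _key(pair):
    # The key is everything before the first "=" in a pair.
    return pair.split("=", 1)[0]


def _rebuild(parts, pairs, separator):
    # Use "/" when the URL has no path, then reassemble the URL.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path or "/",
         separator.join(pairs), parts.fragment)
    )


def trim_query(url, pattern, separator="&"):
    """Drop every query pair whose key matches the glob pattern."""
    parts = urlsplit(url)
    pairs = [p for p in parts.query.split(separator) if p]
    kept = [p for p in pairs if not fnmatchcase(_key(p), pattern)]
    return _rebuild(parts, kept, separator)


def sort_query(url, separator="&"):
    """Sort the query pairs by key, keeping the original order for ties."""
    parts = urlsplit(url)
    pairs = [p for p in parts.query.split(separator) if p]
    return _rebuild(parts, sorted(pairs, key=_key), separator)


print(trim_query("https://curl.se?search=hey&utm_source=tracker", "utm_*"))
print(sort_query("https://example.com?b=a&c=b&a=c"))
print(trim_query("https://curl.se?search=fool;page=5", "search", separator=";"))
```

Passing the separator explicitly keeps the splitting and joining symmetric, so a semicolon-separated query stays semicolon-separated in the result.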
Accept spaces in the URL path:
$ trurl "https://curl.se/this has space/index.html" --accept-space
https://curl.se/this%20has%20space/index.html
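The effect of accepting spaces is that they end up percent-encoded in the output. A small Python sketch of that encoding step, assuming "/" and "%" should be left untouched so that existing escapes are not encoded twice:

```python
from urllib.parse import quote, urlsplit, urlunsplit

parts = urlsplit("https://curl.se/this has space/index.html")

# Percent-encode the path; keep "/" and "%" as-is to avoid double encoding.
encoded = urlunsplit(parts._replace(path=quote(parts.path, safe="/%")))
print(encoded)  # https://curl.se/this%20has%20space/index.html
```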
It's quite easy to compile the C source with GCC:
$ make
cc -W -Wall -pedantic -g -c -o trurl.o trurl.c
cc trurl.o -lcurl -o trurl
trurl is also available in some Linux distributions. You can try searching for it using the package manager of your preferred distribution.
Use make, just like on Linux.
Development files of libcurl (e.g. libcurl4-openssl-dev or libcurl4-gnutls-dev) are needed for compilation. Requires libcurl version 7.62.0 or newer (the first libcurl to ship the URL parsing API).
trurl also uses CURLUPART_ZONEID, added in libcurl 7.81.0, and curl_url_strerror(), added in libcurl 7.80.0.
It would certainly be possible to make trurl work with older libcurl versions
if someone wanted to.
trurl builds with libcurl older than 7.81.0 but will then not work as
well. For all the documented goodness, use a more modern libcurl.
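To make the version requirements above concrete, here is a small Python sketch that reports which of the mentioned libcurl features a given version lacks. The version numbers come from the text above; the REQUIREMENTS table and the missing_features helper are hypothetical names introduced only for this illustration:

```python
# Minimum libcurl versions, taken from the text above.
REQUIREMENTS = {
    "URL parsing API": (7, 62, 0),
    "curl_url_strerror()": (7, 80, 0),
    "CURLUPART_ZONEID": (7, 81, 0),
}


def parse_version(text):
    # "7.79.1" -> (7, 79, 1), so versions compare as tuples.
    return tuple(int(piece) for piece in text.split("."))


def missing_features(libcurl_version):
    """List the features that need a newer libcurl than the one given."""
    have = parse_version(libcurl_version)
    return [name for name, need in REQUIREMENTS.items() if have < need]


print(missing_features("7.79.1"))  # ['curl_url_strerror()', 'CURLUPART_ZONEID']
print(missing_features("7.81.0"))  # []
```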
/
# protocol user host-ip port path path path querystring fragment
^
#protocol
(?:(?<scheme>[a-zA-Z][a-zA-Z\d+.-]*):)?
(?:
(?:
(?:
\/\/
(?:
#userinfo
(?:((?:[a-zA-Z\d\-._~\!$&'()*+,;=%]*)(?::(?:[a-zA-Z\d\-._~\!$&'()*+,;=:%]*))?)@)?
#host-ip
((?:[a-zA-Z\d.%-]+)|(?:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(?:\[(?:[a-fA-F\d.:]+)\]))?
#port
(?::(\d*))?
)
)
#slash-path
(
(?:\/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*
)
)
#slash-path
|(\/(?:(?:[a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(?:\/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*))?)
#path
|([a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(?:\/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)
)?
#querystring
(?:\?([a-zA-Z\d\-._~\!$&'()*+,;=:@%\/?]*))?
#fragment
(?:\#([a-zA-Z\d\-._~\!$&'()*+,;=:@%\/?]*))?
$
/x
/
# allow multiple groups with the same name
(?J)
# protocol user host-ip port path path path querystring fragment
^
#protocol
(?:(?<scheme>[a-zA-Z][a-zA-Z\d+.-]*):)?
(?|
#slash-slash
\/\/
#userinfo
(?:
#user
(?<user>[a-zA-Z\d\-._~\!$&'()*+,;=%]*)
#password
(?::(?<pass>[a-zA-Z\d\-._~\!$&'()*+,;=:%]*))?
@)?
#host-ip
(?<host>(?:[a-zA-Z\d.%-]+)|(?:\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})|(?:\[(?:[a-fA-F\d.:]+)\]))?
#port
(?::(?<port>\d*))?
#slash-path
(?<path>
(?:\/ [a-zA-Z\d\-._~\!$&'()*+,;=:@%]* )*
)
#slash-path
|(?<user>)(?<pass>)(?<host>)(?<port>)
(?<path>\/ (?:[a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(?:\/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)?)
#path
|(?<user>)(?<pass>)(?<host>)(?<port>)
(?<path> [a-zA-Z\d\-._~\!$&'()*+,;=:@%]+(?:\/[a-zA-Z\d\-._~\!$&'()*+,;=:@%]*)*)
)?
#querystring
(?:\?(?<query>[a-zA-Z\d\-._~\!$&'()*+,;=:@%\/?]*))?
#fragment
(?:\#(?<fragment>[a-zA-Z\d\-._~\!$&'()*+,;=:@%\/?]*))?
$
/x
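The patterns above rely on PCRE features such as (?J) and the (?| branch reset, which not every regex engine supports. As a much looser point of comparison, the regular expression given in RFC 3986 Appendix B splits a URI reference into the same top-level components. The Python sketch below adds group names to that expression. It does not validate characters the way the patterns above do, and it leaves userinfo, host and port together in one authority group:

```python
import re

# RFC 3986 Appendix B expression, rewritten with named groups.
URI_RE = re.compile(
    r"""
    ^
    (?:(?P<scheme>[^:/?#]+):)?      # scheme
    (?://(?P<authority>[^/?#]*))?   # authority: userinfo, host and port
    (?P<path>[^?#]*)                # path
    (?:\?(?P<query>[^#]*))?         # query
    (?:\#(?P<fragment>.*))?         # fragment
    $
    """,
    re.VERBOSE,
)

match = URI_RE.match("https://user@example.com:8080/a/b?x=1#top")
components = match.groupdict()
print(components)
```

Components that are absent from the input come back as None, which makes it possible to tell a missing query apart from an empty one.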
http-request redirect code 301 location https://my.url.com%[path,regsub(^/my_old_path,,)] if { path_beg /my_old_path }
Here, the purpose is to redirect a hit on https://url.com/my_old_path/foo/bar/baz
to https://my.url.com/foo/bar/baz.
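The rewriting logic of that rule can be modelled in a few lines of Python, which is handy for checking what Location a given path would produce. This is only a sketch of the string handling: redirect_location is a hypothetical helper, and it makes no attempt to reproduce HAProxy's actual request processing:

```python
import re


def redirect_location(path):
    """Return the redirect target for a path, or None if the rule does not apply."""
    # Models the condition: if { path_beg /my_old_path }
    if not path.startswith("/my_old_path"):
        return None
    # Models regsub(^/my_old_path,,): strip the prefix from the path.
    stripped = re.sub(r"^/my_old_path", "", path, count=1)
    return "https://my.url.com" + stripped


print(redirect_location("/my_old_path/foo/bar/baz"))  # https://my.url.com/foo/bar/baz
print(redirect_location("/other/path"))               # None
```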