Solution for Reverse Proxying onto URLs that are not amenable to being relocated to a subdirectory

I posted this question on serverfault.com, and it turned into "War and Peace", so I decided to duplicate it here to capture my ongoing battle with the problem.
I am also motivated to do this because Stack Exchange has implemented a "community wiki" feature that declares posts to be "community owned" after a certain number of edits, which is most annoying because it takes away any reputation points you might earn after that point.

My workaround for the community wiki problem is to maintain the question and answer content elsewhere for editing, which keeps the number of individual "edits" that serverfault.com sees below the cutoff.

I am starting to believe the old adage that you don't know a system well until you know something about it that really annoys you. ;-)




Short Story
Gah! I wish the developers who build admin interfaces would expose a "webroot=/myAppAppearsHere" option, or make all links relative. When I have 10 different admin interfaces like chef, webmin, logstash and syslog-ng, I don't also want 10 login screens. I want single sign-on supported transparently through Apache HTTP Server's mod_proxy.

Long Story

I have an admin portal for a customer that is basically an Apache mod_auth login followed by a series of links on to back-end admin pages, like so:

https://portal.mysite.com/login    
https://portal.mysite.com/
 
and then a bunch of links like so:
https://portal.mysite.com/monitoring   -> https://nagios.localdomain/nagios
https://portal.mysite.com/munin     -> https://munin.localdomain/nagios
https://portal.mysite.com/backups      -> https://backups.localdomain/backups
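
For the well-behaved apps, each link is a plain mod_proxy mapping. A minimal sketch for the first one, assuming the backend speaks HTTPS and the directives sit in the portal's virtual host (the other links would follow the same pattern):

```apache
# Needed because the backend URL is https://
SSLProxyEngine On

# Forward /monitoring to the nagios backend
ProxyPass        /monitoring https://nagios.localdomain/nagios
# Fix up Location headers on redirects coming back from the backend
ProxyPassReverse /monitoring https://nagios.localdomain/nagios
```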
 
However, there are a couple of applications that are really not happy with being reverse proxied to a subdirectory, for example chef-server-webui and the logstash web interface.

ProxyPassReverse will remap the response headers (such as Location on redirects), but all the absolute URLs inside the pages need to be altered too. If the app config has no option for this, the change has to be forced into the HTML response on its way through the proxy.

The obvious tactic is to create subdomains, or a wildcard subdomain, to map to these apps like so:
https://chef.mysite.com/   -> https://chefserver.localdomain:4040/
https://logstash.mysite.com/   -> https://logstash.localdomain/
https://*.mysite.com/   -> https://($1).localdomain/
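
If a wildcard DNS entry and certificate were available, the last mapping could be approximated with mod_rewrite in proxy mode. This is only a sketch: it assumes the virtual host already answers for *.mysite.com, and that each subdomain has a backend of the same name under localdomain (which is not true of the chef example above, so that one would still need its own rule).

```apache
SSLProxyEngine On
RewriteEngine  On

# Capture the left-most label of the requested host name, e.g. "logstash"
RewriteCond %{HTTP_HOST} ^([a-z0-9-]+)\.mysite\.com(:[0-9]+)?$ [NC]
# [P] hands the request to mod_proxy; %1 is the label captured above
RewriteRule ^/(.*)$ https://%1.localdomain/$1 [P,L]
```

Because the backend name comes straight from the request, anything resolvable under localdomain becomes reachable this way, so the host pattern would want tightening before real use.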
 
Unfortunately I am not in control of the administration of the domain, and obtaining these additions is possible but a pain. I would prefer a solution that doesn't require some 3rd party to be involved for each new link. (I am aware a wildcard would solve this, but I am interested in seeing what HTTP and Apache based alternatives there are ... for learning etc ;-)

So I have started using Apache2::ModProxyPerlHtml, which is similar to mod_proxy_html and allows dynamic remapping of strings in the documents. This actually does work: with some combination of LocationMatch and ProxyHTMLRewrite I can even get the JavaScript to play nice. However, it is a massive ball-ache to do each one, especially for any non web 1.0 apps.
For example, the following almost fixes logstash to work correctly under /logstash:


    # Ask the backend for uncompressed responses so the filter can read them
    RequestHeader   unset   Accept-Encoding
    PerlSetVar ProxyHTMLVerbose "On"
    # Run ModProxyPerlHtml as both input and output filter
    PerlInputFilterHandler Apache2::ModProxyPerlHtml
    PerlOutputFilterHandler Apache2::ModProxyPerlHtml
    SetHandler perl-script
    # Static assets and the search form, one rule per absolute URL
    PerlAddVar ProxyHTMLRewrite "/style.css /logstash/style.css"
    PerlAddVar ProxyHTMLRewrite "/css/smoothness/jquery-ui-1.8.5.custom.css /logstash/css/smoothness/jquery-ui-1.8.5.custom.css"
    PerlAddVar ProxyHTMLRewrite "/js/jquery-1.6.1.min.js /logstash/js/jquery-1.6.1.min.js"
    PerlAddVar ProxyHTMLRewrite "action='/search' action='/logstash/search'"
    PerlAddVar ProxyHTMLRewrite "/js/jquery-ui-1.8.13.min.js /logstash/js/jquery-ui-1.8.13.min.js"
    PerlAddVar ProxyHTMLRewrite "/media/throbber.gif /logstash/media/throbber.gif"

    # API endpoints
    PerlAddVar ProxyHTMLRewrite "/api/search /logstash/api/search"
    PerlAddVar ProxyHTMLRewrite "/api/histogram /logstash/api/histogram"


But it's extremely hit and miss, and you can't just wildcard the URL swap, because there is loads of JSON and JavaScript that gets mangled.
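
To show what "wildcarding the URL swap" means, here is a catch-all version written for mod_proxy_html rather than ModProxyPerlHtml. It is a sketch, not the configuration used above: the /logstash/ location and backend URL are borrowed from the earlier examples, and SSLProxyEngine On is assumed to be set at virtual host level.

```apache
<Location /logstash/>
    ProxyPass        https://logstash.localdomain/
    ProxyPassReverse https://logstash.localdomain/

    # The filter cannot parse compressed responses
    RequestHeader    unset Accept-Encoding

    ProxyHTMLEnable  On
    # Catch-all: prefix every root-relative URL with /logstash/
    ProxyHTMLURLMap  / /logstash/
    # Also apply the map inside inline scripts and stylesheets
    ProxyHTMLExtended On
</Location>
```

In extended mode the map is applied to matching text anywhere in a script, not just to link attributes, so a pattern as short as "/" hits far more than URLs. That is the kind of mangling that makes the catch-all unusable here.
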
I was thinking of some sort of cookie or query string variable that tracked the current proxy backend, so Apache could dynamically direct the request on to the correct backend.

Something like this:
https://admin.mysite.com/?request-proxy=chef -> https://chefserver.localdomain:4040/
https://admin.mysite.com/?request-proxy=logstash  -> https://logstash.localdomain/
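
On the request side, that mapping could be sketched with mod_rewrite conditions on the query string. This assumes the rules live in the admin.mysite.com virtual host; the hard part, getting request-proxy onto every link in the first place, is not addressed here.

```apache
SSLProxyEngine On
RewriteEngine  On

# ?request-proxy=chef anywhere in the query string selects the chef backend
RewriteCond %{QUERY_STRING} (^|&)request-proxy=chef(&|$)
RewriteRule ^/(.*)$ https://chefserver.localdomain:4040/$1 [P,L]

# ?request-proxy=logstash selects the logstash backend
RewriteCond %{QUERY_STRING} (^|&)request-proxy=logstash(&|$)
RewriteRule ^/(.*)$ https://logstash.localdomain/$1 [P,L]
```

The original query string is passed through unchanged, so the backend would also see the request-proxy parameter.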
 
 
Basically, as Apache gets the last look at all the server's HTTP content, it could dynamically tag URLs with the additional query variable &request-proxy=logstash. However, I think that would suffer from the same problem as the ModProxyPerlHtml/mod_proxy_html solution, in that it would never work everywhere, especially in apps where some JavaScript is used to bejiggle the query params client side.

I guess a cookie would almost work, in that you could proxy based on some passed cookie value, say "request-proxy=logstash". However, this would break if you had 2 tabs open to the site, as they would probably overwrite each other's cookies.
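
The cookie variant would look much the same, matching on the Cookie header instead. Again a sketch: it assumes something else has already set the request-proxy cookie, and that RewriteEngine and SSLProxyEngine are switched on for the virtual host.

```apache
# Route on the value of the request-proxy cookie
RewriteCond %{HTTP_COOKIE} (^|;\s*)request-proxy=logstash(;|$)
RewriteRule ^/(.*)$ https://logstash.localdomain/$1 [P,L]
```
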
I know that some apps just take a brute force approach and wrap the whole proxied request in re-baked HTML, such as the Netscreen SA-3000.
Anyway, are there any Apache modules that implement either of these strategies, or that somehow sidestep writing matching rules for each proxied site?
  1. PS: I am aware of lemonldap, but I didn't get far without having to dive down into Perl code. It looks cool though, and I will take another look in future.
  2. I am starting to suspect that, time-wise, I might as well just remap these HTML pages with ModProxyPerlHtml, because there won't be a one-size-fits-all solution.
