I’m new to Phoenix and web development, so if I missed something in the docs, please point me there and I’ll study.
I’m helping someone with a Phoenix project where they’d like to differentiate between a web browser loading a page and curl/wget loading it. The idea is that when loaded by a browser, the user gets HTML/etc. with possibly some options for interacting more with the site. When loaded by curl or wget, the user would get text with ANSI escape codes so that it would look nice when dumped to a terminal window.
It seems like we want to check the user-agent string. Is it a best practice to put that check in the view, controller, or router? At first this seemed like a view concern, but it seems really convenient to put it in the router. Like maybe in an :accepts-like plug, but checking the user-agent. Plus I could cut out pipeline steps for web browsers that aren’t needed for curl/wget commandline clients.
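For what it’s worth, here is a minimal sketch of what such a user-agent check could look like as a plain function (module name and prefix list are my own assumptions; curl identifies itself as `curl/<version>` and wget as `Wget/<version>`):

```elixir
defmodule CliDetect do
  # Prefixes of common terminal clients; an assumption, not an exhaustive list.
  @cli_prefixes ["curl/", "Wget/", "HTTPie/"]

  @doc "Returns true when the User-Agent string looks like a terminal client."
  def cli_client?(user_agent) when is_binary(user_agent) do
    Enum.any?(@cli_prefixes, &String.starts_with?(user_agent, &1))
  end
end
```

In the router you could then wrap this in a small plug that assigns the detected format to the conn; whether that check belongs in the router, controller, or view is exactly the question above.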
As curl is mostly used for debugging or scripting rather than for consuming a site in a human-readable format, I’d just treat it as a regular browser.
If I were scripting, and the XPath I had carefully verified in the browser suddenly stopped working with curl because it now receives preformatted plain text, that would break all my expectations about how curl behaves.
Last but not least, if I really want to browse on a terminal, I’d use lynx, links, or brow.sh rather than curl | less…
But I really like the idea of providing alternatives to HTML through content negotiation. Perhaps a site which can be fully consumed as Markdown, via a Markdown browser? This could save a lot of bandwidth on small data plans…
If it’s meant to be an actual feature and not some rarely used debugging tool, I’d opt for a fitting content type, as mentioned above. That also makes it easy for curl to still get the HTML when needed.
The same is true for the DOM viewer in browsers’ dev tools. It’s not meant to show source code, but the DOM. I’ve not yet had problems using context menu > View Source (or whatever it’s named in a given browser); that should still show exactly what the server sent.
It is fascinating how someone asks a question and people just dismiss the idea outright.
Maybe that person wants to build something like curl wttr.in? That does basically the same thing: if opened via curl, you get a response with ANSI escapes for pretty console output, and if opened in a browser, you get the same content as HTML with some social-sharing widgets.
From a usability perspective, this is a really great solution. Imagine you had to explicitly set an Accept header every single time. That would be annoying AF.
Imagine I use wget in an HTML-to-PDF service and all I get back is the terminal text and not the HTML. That can be just as annoying AF. The proper way is content-type negotiation if the URL is to stay the same. The rest is weighing convenience in one place against inconvenience in others. What @NobbZ posted above is his version of “this is inconvenient”.
Using it as a way of determining the representation of an HTTP response seems just as strange to me - but I acknowledge that probably makes me an outlier these days.
The primary intent of the Accept header is to allow a consumer to specify what format is acceptable. And Phoenix has a really nice and clean way to respect that request. I don’t want, as a consumer, for you to decide what representation I should have. You tell me what you can deliver, I’ll tell you what I want.
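For reference, Phoenix’s built-in negotiation hooks in at the router via the `:accepts` plug (the pipeline name here is made up; the plug itself is standard Phoenix):

```elixir
# A router pipeline that accepts both html and text formats; Phoenix's
# :accepts plug negotiates against the request's Accept header (or an
# explicit _format param) and stores the result on the conn.
pipeline :browser_or_cli do
  plug :accepts, ["html", "text"]
end
```

Controllers can then read the negotiated format (e.g. via `Phoenix.Controller.get_format/1`) and render the matching template.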
Any other approach is rife with ambiguity. If you decide what to send me based upon my user agent, and I specify an Accept header - which representation are you going to decide to send me?
In this case, the Accept header would have higher priority than the user-agent.
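A sketch of that precedence, assuming a hand-rolled helper (real Accept headers carry q-values and multiple media types, which this deliberately ignores):

```elixir
defmodule FormatChoice do
  # The explicit Accept header wins; the User-Agent is only consulted for
  # the wildcard default ("*/*") that curl and wget send when no header is set.
  def choose(accept, user_agent) do
    cond do
      String.contains?(accept, "text/html") -> :html
      String.contains?(accept, "text/plain") -> :text
      cli?(user_agent) -> :text
      true -> :html
    end
  end

  defp cli?(ua) do
    String.starts_with?(ua, "curl/") or String.starts_with?(ua, "Wget/")
  end
end
```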
Let’s go back to the curl example. Do you really want to type curl -H "Accept: text/plain" http://wttr.in all the time? In my case, I had to first google how to set a header via curl because I use it very rarely.
Sometimes the most usable solution is not the “correct solution” those professors want to teach you in university.
What’s a proper user-agent though? curl is just as valid a user-agent as anything sent by certain browsers. As it’s implemented right now, even setting Accept: text/html won’t make wttr.in send back HTML content. The only way to get HTML back is to not use one of the hardcoded “terminal” user agents, which are simply assumed to want text returned, not HTML.
In my HTML-to-PDF example I wasn’t really talking about me being in control of that. I was talking about some random service on the web, where I simply supply a URL and get back a PDF.
That’s true, but it’s a tradeoff to be decided by whomever is in charge of implementation.
But the discussion of drawbacks has its usefulness as well. You already mentioned letting the Accept header trump the user-agent. We also know that browsers send that header with text/html requested. So maybe the best way is to send text for Accept: */* (the default for curl/wget) and send HTML only when it is requested via the Accept header, to cater for browsers. If I want HTML in the terminal, I can add the header, which also solves the imagined example of an HTML-to-PDF service; such a service really should set the header to request HTML anyway.
To me this sounds like a solution that doesn’t require terminal users to set an Accept header to get text back, while still using only content negotiation for any tool that wants the HTML content.
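As a sketch, that proposal boils down to something like this (my naming; again ignoring q-values for brevity):

```elixir
defmodule PureNegotiation do
  # No User-Agent sniffing at all: HTML only when explicitly requested,
  # which browsers always do; curl/wget's default "*/*" falls through to text.
  def format(accept) do
    if String.contains?(accept, "text/html"), do: :html, else: :text
  end
end
```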