Making robots.txt environment-specific

Hello,

I would like robots.txt to be empty in production, but in my staging environment I'd like it to contain:

User-agent: *
Disallow: /

Is there a simple way to achieve this?

Thanks

robots.txt is a static file pushed to the web root by copy-webpack-plugin.

I think your solution is to check the environment in webpack and change the copy-webpack-plugin config to depend on it.

I would simply define a route for it and dynamically render it by environment.

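A minimal sketch of that approach, assuming a hypothetical RobotsController and a :deploy_env setting in the application config:

lib/my_app_web/controllers/robots_controller.ex

defmodule MyAppWeb.RobotsController do
  use MyAppWeb, :controller

  # Empty robots.txt in production, block all crawlers everywhere else.
  def index(conn, _params) do
    body =
      case Application.get_env(:my_app, :deploy_env) do
        :production -> ""
        _ -> "User-agent: *\nDisallow: /\n"
      end

    conn
    |> put_resp_content_type("text/plain")
    |> send_resp(200, body)
  end
end

and in lib/my_app_web/router.ex:

get "/robots.txt", RobotsController, :index

Note that you'd also need to stop serving a static robots.txt (i.e. drop it from the Plug.Static list in the endpoint) so the route is actually reached.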

This could make sense if I needed dynamic content based on the app's data (e.g. excluding some routes based on IDs), but in my case the webpack solution seems more appropriate.

Another option could be to have your configuration management tool write your robots.txt file. That's what I do with Ansible. Then I just write a different robots.txt depending on which server / environment I'm deploying to, and the code for it lives alongside my other environment-specific settings.

The webpack solution means you have to create a separate build for each environment. So even if you are fine with the state on the staging system, you still have to create a new build for the prod system instead of just adjusting some env vars.

What do you mean by "your configuration management tool write your robots.txt file"?

Certain configuration management tools allow you to have different settings for different environments. You could have your tool of choice write one robots.txt to your staging server and a different one to your production server.

In other words your robots.txt file is managed by the tool you use to deploy your site instead of being part of your code repo. There’s little value in having it as part of your code repo IMO, especially since you’ll very likely want it to be different between environments.

Also, in the future, if you want a dynamic robots.txt you can either generate it on the spot with a custom Phoenix route, or have a cron job on your server regenerate the file on whatever schedule you want, so you can serve it with nginx or a CDN.

In all of these cases, the robots.txt file doesn't live directly in your code repo.

Reviving this thread with a solution for achieving this behavior with a Plug, since I haven't been able to find an answer here or on the Fly.io forums:

lib/my_app_web/plugs/robots.ex

defmodule MyAppWeb.Plugs.Robots do
  @behaviour Plug

  import Plug.Conn

  @impl true
  def init(opts), do: opts

  # Serve the environment-specific file for requests to /robots.txt.
  @impl true
  def call(%Plug.Conn{request_path: "/robots.txt"} = conn, _opts) do
    deploy_env = Application.get_env(:my_app, :deploy_env)
    file = Application.app_dir(:my_app, "priv/static/robots-#{deploy_env}.txt")
    content = File.read!(file)

    conn
    |> put_resp_content_type("text/plain")
    |> send_resp(200, content)
    |> halt()
  end

  # All other requests pass through untouched.
  def call(conn, _opts), do: conn
end

lib/my_app_web/endpoint.ex

defmodule MyAppWeb.Endpoint do
  use Phoenix.Endpoint, otp_app: :my_app

  # Place this before Plug.Static so it answers /robots.txt first.
  plug MyAppWeb.Plugs.Robots

config/runtime.exs

deploy_env =
  case System.get_env("PHX_HOST") do
    "myapp.com" -> :production
    "staging.myapp.com" -> :staging
    _ -> :dev
  end

config :my_app, :deploy_env, deploy_env

Then, create the relevant robots-staging.txt and robots-production.txt files in priv/static/ (and a robots-dev.txt, since deploy_env falls back to :dev when PHX_HOST isn't set).

Remember to remove robots.txt from static_paths in lib/my_app_web.ex, although as long as this plug is added before the Plug.Static configuration in the endpoint it will return a response first anyway.
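
For reference, the static_paths change might look like this (the exact list depends on what your Phoenix generator produced); robots.txt is simply dropped from it:

lib/my_app_web.ex

  # robots.txt removed so Plug.Static no longer serves a static copy
  def static_paths, do: ~w(assets fonts images favicon.ico)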

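To sanity-check the plug, here's a minimal test sketch (assuming the default generated ConnCase, and that a robots file exists for the test environment's fallback deploy_env):

test/my_app_web/plugs/robots_test.exs

defmodule MyAppWeb.Plugs.RobotsTest do
  use MyAppWeb.ConnCase, async: true

  test "serves robots.txt from priv/static", %{conn: conn} do
    # The request goes through the endpoint, so the Robots plug handles it
    # before Plug.Static or the router.
    conn = get(conn, "/robots.txt")

    assert response(conn, 200)
  end
end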