Criticize my idea for integration testing with automatically managed assertions of the whole HTML

I want to speed up my integration testing by making tests in the following way:

1/ Set up by creating records in the database with fixed data, even fixing (freezing) the current time so the inserted_at values are the same on each test run.

2/ Do some actions (load an LV/non-LV page, click a link, or trigger an LV action).

3/ KEY PART: instead of multiple assertions on phrases or HTML tags/attributes, I call my CustomTesting.assert(html, call_identifier) function, which compares the HTML output (the whole webpage) to the output stored during the previous successful test run for this particular assertion call (identified by call_identifier). A rough sketch of such a function follows after this list.

  • during the first run the HTML output is sent to the browser for me to view and visually inspect that everything is fine. Then I click the “Store” button at the very bottom to store this HTML output for future test runs.

  • when such an assertion fails, it opens two tabs in the browser showing a diff (like kdiff, git diff). The first tab loads the HTML from the test run and wraps each difference in a div with a thick red border; in the second tab the same happens for the expected HTML with a green border. Below there is a “Store” button to replace the expected HTML with the newly produced HTML if I want to.

4/ I can continue with more actions and CustomTesting.assert() calls.
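
Roughly what I have in mind for CustomTesting.assert/2, as a hypothetical sketch only: plain file storage stands in for the browser review and the “Store” button, and the module name, snapshot directory and file naming are all made up.

```elixir
defmodule CustomTesting do
  import ExUnit.Assertions, only: [flunk: 1]

  @snapshot_dir "test/html_snapshots"

  def assert(html, call_identifier) do
    path = Path.join(@snapshot_dir, "#{call_identifier}.html")

    case File.read(path) do
      {:ok, ^html} ->
        # Stored snapshot matches the freshly rendered HTML.
        :ok

      {:ok, _expected} ->
        # A real implementation would open the red/green diff in the browser
        # here and offer the "Store" button instead of failing outright.
        flunk("HTML snapshot #{call_identifier} differs from #{path}")

      {:error, :enoent} ->
        # First run: persist the output. In the described workflow it would
        # first be shown in the browser for visual inspection.
        File.mkdir_p!(@snapshot_dir)
        File.write!(path, html)
        :ok
    end
  end
end
```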

Advantages that I see over using several (sometimes more than 10) small assertions to test that the key elements of the webpage are ok:

1/ I don’t have to write these mini assertions or update them manually when my main code changes. I don’t have to decide what is important to assert and what isn’t, or worry about many other details, like whether the sidebar could contain the same phrase I assert on (now or in a future version of my app).

2/ The full-HTML assertions are easy to manage even though they are brittle (easy to break). If I change my footer, for example, all the tests will fail, but in CustomTesting I can have a function to rerun all the tests and replace all the stored HTML output (see the sketch after this list), because I know the footer change is the only change and it won’t break my actual code, so I don’t have to inspect each test manually.

3/ Manual inspection of failed tests happens in the browser, with red/green regions guiding me to where the differences are. I can fix my main code, fix my test setup/actions, or click the “Store” button at the bottom to update the test assertion.
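
The “replace all” mode from advantage 2 could be as small as an environment-variable switch; a hypothetical sketch extending the CustomTesting sketch above (the UPDATE_SNAPSHOTS variable name is made up):

```elixir
defmodule CustomTesting.Update do
  # Hypothetical "accept everything" switch: when UPDATE_SNAPSHOTS=1 is set,
  # a mismatching snapshot is overwritten instead of failing the test.
  def handle_mismatch(path, html) do
    if System.get_env("UPDATE_SNAPSHOTS") == "1" do
      File.write!(path, html)
      :updated
    else
      :mismatch
    end
  end
end
```

Usage would then be something like UPDATE_SNAPSHOTS=1 mix test.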

What bothers me:

  • I can’t find a similar approach or testing tool in use in any programming language (maybe there is one). If there isn’t, there is probably a good reason why. The only similar thing I found is snapshot testing (overkill in my opinion), which takes snapshots of the page (like a .jpg) and then compares the images.

Can you think of why this approach is a bad idea and stop me before I start implementing it?

Taking the idea a step further:

To automate step 2 (actions): do them in the browser and have an LV extension record my actions and store them as JSON or as Elixir code.

To automate step 1 (setup): you can do the actions in the browser, but instead of storing the actions, you store the DB state at the end. That improves the performance of future test runs. You can also store the actions and run them automatically to ensure you have a proper setup (to automate updating the setup); those runs will be slow and often overkill, so run them only when you have a good reason.
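
Storing the DB state at the end could be sketched roughly like this, assuming Ecto with a MyApp.Repo and a hand-picked list of schemas (all names here are illustrative assumptions):

```elixir
defmodule CustomTesting.DbSnapshot do
  # Illustrative schema list; in practice you would list the tables your
  # setup actually touches.
  @schemas [MyApp.Accounts.User, MyApp.Blog.Post]

  def store(path) do
    snapshot =
      for schema <- @schemas, into: %{} do
        fields = schema.__schema__(:fields)
        rows = MyApp.Repo.all(schema) |> Enum.map(&Map.take(&1, fields))
        {schema, rows}
      end

    File.write!(path, :erlang.term_to_binary(snapshot))
  end

  def restore(path) do
    path
    |> File.read!()
    |> :erlang.binary_to_term()
    |> Enum.each(fn {schema, rows} -> MyApp.Repo.insert_all(schema, rows) end)
  end
end
```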

Snapshot testing (I’ve also seen it called approval testing) is, as far as I know, the name for the more general idea of storing a previous result to compare against later ones. The implementation of that idea might compare JPEGs, HTML, or really anything you can compare.

However, I’m not aware of any library that implements it using the approach you described. In automated testing, approaches that require manual input are generally not as popular.


Maybe you can take some inspiration from mneme, which implements snapshot testing (but not entirely in the way you describe).
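
For reference, mneme’s auto_assert flow looks roughly like this; it’s a sketch based on its documented usage (MyApp.greeting/1 is a stand-in function, and mneme also needs Mneme.start() in test/test_helper.exs, so double-check the current docs):

```elixir
defmodule MyApp.GreetingTest do
  use ExUnit.Case, async: true
  use Mneme

  test "greets by name" do
    # On the first run mneme prompts in the terminal to accept a generated
    # pattern and rewrites this line in the source; later runs compare
    # against the stored pattern and prompt again when the value changes.
    auto_assert MyApp.greeting("world")
  end
end
```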


Some rare apps regenerate the CSRF hidden tag value often. If yours were one of them, a complete snapshot of the page would not be valuable for comparison.

I do sympathize that you don’t want to manually assert on a bunch of things, but you can always make helpers for it, e.g. assert_breadcrumbs or assert_login_links?
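
For example, such a helper could look roughly like this with Floki (the nav.breadcrumbs selector is an assumption about your markup):

```elixir
defmodule MyAppWeb.AssertionHelpers do
  import ExUnit.Assertions

  # Asserts the breadcrumb labels, in order, regardless of the rest of the page.
  def assert_breadcrumbs(html, expected_labels) do
    labels =
      html
      |> Floki.parse_document!()
      |> Floki.find("nav.breadcrumbs li")
      |> Enum.map(&(&1 |> Floki.text() |> String.trim()))

    assert labels == expected_labels
  end
end
```

Used in a test as assert_breadcrumbs(html, ["Home", "Products", "Red Shoes"]).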


You could probably strip those tags, or load the page a few times and remove everything dynamic.
But overall it still seems horribly tedious.
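
Stripping could look roughly like this with Floki; the selectors assume Phoenix’s usual CSRF hidden input and meta tag names:

```elixir
defmodule CustomTesting.Normalize do
  # Removes markup that legitimately changes between identical renders, so the
  # stored and freshly rendered snapshots can be compared.
  def strip_dynamic(html) do
    html
    |> Floki.parse_document!()
    |> Floki.filter_out("input[name='_csrf_token']")
    |> Floki.filter_out("meta[name='csrf-token']")
    |> Floki.raw_html()
  end
end
```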

I don’t know what exactly you want to test, but does the actual HTML code even matter?
There’s got to be some library/framework built on top of Selenium etc. where, instead of HTML code, you can just assert “there must be an element with text xyz visible on the page”.
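
In Elixir, Wallaby is in that space; a minimal sketch assuming a configured Wallaby setup (driver configuration omitted, see its docs):

```elixir
defmodule MyAppWeb.HomePageTest do
  use Wallaby.Feature

  alias Wallaby.Query

  feature "shows the expected text", %{session: session} do
    # Asserts that an element containing the given text is visible on the page.
    session
    |> visit("/")
    |> assert_has(Query.text("xyz"))
  end
end
```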


The HTML matters because that is what a web app is: you put a request IN and you get HTML OUT. So HTML is the main thing that matters. Testing each and every important element with text xyz is tedious, because you have to do it manually. Testing the whole HTML is completely automated and super easy: you add one line of code (which can be the same for every assertion) and run the test twice, as you said (a nice idea to eliminate/isolate the dynamic stuff). You also see the diff of the failed test visually in the browser, so determining what went wrong costs less mental effort.
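
The “run the test twice” idea could be sketched naively like this: render the same page twice and mask whatever differs before storing the snapshot (module and function names are hypothetical):

```elixir
defmodule CustomTesting.Dynamic do
  # Naive line-based masking: assumes both renders have the same number of
  # lines, i.e. dynamic values (CSRF tokens, ids, timestamps) don't shift the
  # surrounding markup.
  def mask_dynamic(html_a, html_b) do
    String.split(html_a, "\n")
    |> Enum.zip(String.split(html_b, "\n"))
    |> Enum.map(fn
      {same, same} -> same
      {_a, _b} -> "<!-- dynamic content masked -->"
    end)
    |> Enum.join("\n")
  end
end
```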