How I Created A Secure, Self-Hosted Survey In Minutes With React + SurveyJS

The importance of fully owning survey data in a privacy-focused world.

June 23rd, 2022

Online questionnaires - whether for the education sector, healthcare, or simply employee feedback - frequently request information for which respondents must be guaranteed anonymity and privacy so you, the survey maker, can remain legally compliant and ensure honest answers.

So the software you use to create and host surveys needs to be *more* than just a third-party, black-box SaaS platform where intellectual property laws lock you out of inspecting, monitoring, or improving the actual service.

To demonstrate this point, we'll build a survey with SurveyJS - a fast, versatile, free and open-source (MIT license) JavaScript library for forms and surveys that is self-hostable and lets you retain full ownership of respondents' data, *without* giving up privacy or being locked into a service you can't control.

But first...let's deal with the obvious question.

"Privacy"? Why Should I Care?

Even when surveys are conducted anonymously, and with informed consent, you are not off the hook.

Even if you store only aggregated data, small sample sizes could inadvertently identify an individual, leading to adverse consequences. Also, most countries have laws governing the storage and security of collected data (HIPAA, FERPA, GDPR), meaning you are liable for any stored personal information in your custody - including its secure destruction at the end of the retention period, with proof.

The easiest way to ensure individual privacy and legal compliance is to self-host SurveyJS. You would run surveys and store responses *completely* on your own infrastructure, keeping total control of the data flow between server and client without any third party involved.

As for the benefits? How about...

1. Full ownership of your data.

SurveyJS is distributed as client-side libraries for React, Angular, Vue.js, Knockout, and of course, jQuery. You also have total freedom of choice as to the backend, because *any* server + database combination is fully compatible with SurveyJS as long as it supports storage and processing of JSON, text, or BLOB fields.

The fact that SurveyJS is so loosely coupled means you're free to secure and administer each layer of software and hardware underneath *exactly* the way you want: SSL, firewalls, Linux-only, go wild. Plus, you don't have to change anything in the actual survey application code to comply with future regulations.

And finally, SurveyJS supports webhooks, meaning you have full control over what to do with the data (and how to do it) after a respondent completes a survey. You'll usually want to store it in a database, release it to legal bodies as and when required, and delete it once you're done building your business/internal metrics.

Note: While not SurveyJS-specific, having full control over data in-house also means you can manage access to it on a need-to-know basis, enforcing a least-privilege policy.

2. Customization and Extendability.

You can use custom CSS and included themes to make your surveys look great and stand out with your brand's own design language, not the samey defaults provided by the third-party vendor.

SurveyJS is also extendable with third-party JavaScript components like SortableJS, react-tag-box, etc. You can mix and match components from different libraries to get the exact functionality you want.

Also a feature: populating choices using REST APIs that you can define in the survey/form schema itself, without any JavaScript code for asynchronous XHR.
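As a rough sketch, a dropdown that pulls its options from a REST endpoint might be declared like this. `choicesByUrl` is the SurveyJS property for this; the endpoint URL and the `id`/`title` field names are placeholders, so swap in whatever your own API returns.

```javascript
// Hypothetical question definition; the URL and field names are placeholders.
const courseQuestion = {
  type: "dropdown",
  name: "course",
  title: "Which course are you reviewing?",
  choicesByUrl: {
    url: "https://example.com/api/courses", // assumed endpoint returning a JSON array
    valueName: "id",    // field stored as the answer
    titleName: "title"  // field shown to the respondent
  }
};

// The question slots into a regular survey schema.
const choicesSchema = { elements: [courseQuestion] };
console.log(JSON.stringify(choicesSchema, null, 2));
```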

3. Avoiding vendor lock-in.

Don't you hate it when a third-party solution's performance standards go down, yet they keep raising prices? What about when they drop support for technology that you're dependent on?

In those cases, you have no recourse. You're fully dependent on the vendor for business-critical data because you're locked in to a bad choice with too many dependencies. And since SaaS-supported software is never 100% vendor-agnostic, you'll always incur costs if you want to move your business elsewhere.

Self-hosting a SurveyJS solution grants you true freedom, and makes long-term planning not only possible, but also viable.

Alright. So How Do I Do It?

Here's the game plan.

The SurveyJS offering consists of four individual products, but we're only concerned with our server and the client-side SurveyJS Library.

Our survey will be built using React components implementing the SurveyJS library, on top of an Express backend that proxies requests back to our real server.

SurveyJS uses a data-driven approach - meaning you'll define the survey as a data model (a schema) written in JSON, which your React app retrieves from the backend via the Express REST API and renders using templates. On completion, the app sends the survey response (also as JSON) back the same way, and your backend adds it to the database.

Note: For inspiration, feel free to check out the example services for Node.js, ASP.NET MVC, and PHP provided by SurveyJS to help you get a backend up and running quickly. These are only meant as a guide, and are not recommended for use in production as-is.

In terms of security, here's what we're ensuring in this code example:

  1. Proper validation, to get rid of rogue HTML inside survey responses that could trigger malicious behavior.
  2. Protection from common attack vectors like XSS and CSRF.
  3. Using a backend to proxy requests back to the actual server, as a kind of web filter.

Three things you couldn't do if you *weren't* self-hosting this survey.

So let's get down to the code.

1. The Schema

This is a pretty standard students' course survey with a bit of conditional logic. It's an ideal case study from a data security/privacy standpoint, because this information could be protected under FERPA in the US.

Our first layer of protection is validation with regular expressions, making sure no HTML is allowed in the responses, since embedded markup is the usual vehicle for Cross-site scripting (XSS) attacks.

We'll also protect against this on the backend, but meanwhile, SurveyJS provides a robust validation system that you can define in the JSON schema itself with regexes, sidestepping the need for any additional code bloating up the React frontend.

Plus, this kind of validation is good UX design, giving respondents instant feedback on what they're doing wrong, instead of having to wait until the survey is sent.
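A minimal sketch of what such a schema could look like follows. The question names and wording are hypothetical, and the regex is a deliberately simple one that rejects angle brackets; `validators` with `type: "regex"` and `visibleIf` are the SurveyJS schema properties doing the work.

```javascript
// Hypothetical course-survey schema; question names and wording are illustrative.
const NO_HTML = "^[^<>]*$"; // rejects any answer containing "<" or ">"

const surveySchema = {
  title: "Course Feedback",
  elements: [
    {
      type: "radiogroup",
      name: "wouldRecommend",
      title: "Would you recommend this course?",
      choices: ["Yes", "No"],
      isRequired: true
    },
    {
      type: "comment",
      name: "improvements",
      title: "What should we improve?",
      // Conditional logic: only shown when the answer above is "No".
      visibleIf: "{wouldRecommend} = 'No'",
      validators: [
        { type: "regex", regex: NO_HTML, text: "Please don't use < or > in your answer." }
      ]
    }
  ]
};

// Quick check of the pattern outside SurveyJS.
const noHtmlPattern = new RegExp(NO_HTML);
console.log(noHtmlPattern.test("More lab time, please"));     // true
console.log(noHtmlPattern.test("<script>alert(1)</script>")); // false
```

A stricter or looser pattern may suit your survey better; the point is only that the rule lives in the JSON, not in the React code.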

2. The Frontend

First of all, we're retrieving the schema from the backend (getSchema()) and using templates to render the survey from that schema. Then, we're using SurveyJS' support for webhooks (the onComplete trigger) to POST our respondent's submitted survey results to the backend, where they'll be added to the database.

To protect against CSRF attacks, we make use of cookies as authorization to generate CSRF tokens on the backend (that's what getCSRFToken() does on first load) that we then use to make a legitimate POST request (in addToDatabase()).
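The three helpers named above could be sketched roughly as follows. The base URL, the endpoint paths, the `X-CSRF-Token` header name, and the `csrfToken` field in the token response are all assumptions made for illustration; match them to whatever your own backend expects.

```javascript
// Sketch of the helpers; BASE_URL, paths, and the header name are assumptions.
const BASE_URL = "http://localhost:3001"; // hypothetical Express backend

// Pure helper: builds the fetch options for a CSRF-protected POST.
function buildPostOptions(csrfToken, results) {
  return {
    method: "POST",
    credentials: "include", // send the cookie set by the backend
    headers: {
      "Content-Type": "application/json",
      "X-CSRF-Token": csrfToken
    },
    body: JSON.stringify(results)
  };
}

// Fetches the survey schema (JSON) from the backend.
async function getSchema(fetchFn = fetch) {
  const res = await fetchFn(`${BASE_URL}/schema`, { credentials: "include" });
  if (!res.ok) throw new Error(`Schema request failed: ${res.status}`);
  return res.json();
}

// Asks the backend for a CSRF token on first load.
async function getCSRFToken(fetchFn = fetch) {
  const res = await fetchFn(`${BASE_URL}/csrf-token`, { credentials: "include" });
  if (!res.ok) throw new Error(`Token request failed: ${res.status}`);
  const { csrfToken } = await res.json();
  return csrfToken;
}

// POSTs the completed survey results together with the token.
async function addToDatabase(csrfToken, results, fetchFn = fetch) {
  const res = await fetchFn(`${BASE_URL}/results`, buildPostOptions(csrfToken, results));
  if (!res.ok) throw new Error(`Saving results failed: ${res.status}`);
}
```

Inside the React component, the wiring would then be roughly `survey.onComplete.add((sender) => addToDatabase(token, sender.data))`, where `token` comes from `getCSRFToken()` on first load. The optional `fetchFn` parameter only exists so the helpers can be exercised with a stub.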

Also, here we have our second layer of protection against XSS attacks - simply using React's default data binding with curly braces to render data through JSX, and letting React sanitize the HTML.
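To make the effect concrete, here's a stand-alone function that applies the same kind of escaping by hand. It's an illustration of the idea only, not React's internal code, and `escapeHtml` is a name made up for this sketch.

```javascript
// Illustrative only: the kind of character escaping that happens when a string
// is rendered as text instead of being interpreted as markup.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;"
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

The browser displays the escaped string as literal text, so the injected tag never becomes a real element.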

Server-side security measures like dealing with XSS/CSRF attacks are beyond the scope of SurveyJS as a client-side library. They're only included in this code example to give you an idea of what a complete self-hosted instance of a survey should look like.

3. The Backend

We have the rest of our CSRF protection strategy implemented here, using the CORS middleware to make sure that requests are only considered legitimate if they:

a) ...have communicated beforehand to get an approved CSRF token,

b) ...are using that *same* token, and

c) ...are *only* from an approved list of origin URLs (we're storing those in the allowed_origins variable in our dotenv).
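Stripped of any framework, the three checks above boil down to something like the following. This is a plain Node.js sketch of a double-submit style comparison rather than the exact middleware wiring; the function names and the comma-separated format assumed for allowed_origins are hypothetical.

```javascript
const nodeCrypto = require("node:crypto");

// Assumed format: comma-separated origins, e.g. read from process.env.allowed_origins.
function parseAllowedOrigins(raw) {
  return (raw || "").split(",").map((o) => o.trim()).filter(Boolean);
}

// Issued on first load and handed to the client in a cookie.
function generateCsrfToken() {
  return nodeCrypto.randomBytes(32).toString("hex");
}

// Constant-time comparison, so the check doesn't leak how much of the token matched.
function tokensMatch(a, b) {
  if (typeof a !== "string" || typeof b !== "string") return false;
  const bufA = Buffer.from(a);
  const bufB = Buffer.from(b);
  return bufA.length === bufB.length && nodeCrypto.timingSafeEqual(bufA, bufB);
}

// Checks (a), (b) and (c) for a single request.
function isLegitimateRequest({ origin, cookieToken, headerToken }, allowedOrigins) {
  const hasToken = Boolean(cookieToken);                   // (a) a token was issued earlier
  const sameToken = tokensMatch(cookieToken, headerToken); // (b) the same token came back
  const originOk = allowedOrigins.includes(origin);        // (c) the origin is approved
  return hasToken && sameToken && originOk;
}
```

In an Express app, a check like this would sit in a middleware in front of the POST route and respond with a 403 when it returns false.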

If you're using the Node.js backend example provided by SurveyJS, note that we're only using client-side cookies for authorization here, while that example uses sessions (server *and* client-side) via the express-session package.
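As for the backend half of the XSS protection, one simple option is to re-check every submitted answer before it goes anywhere near the database, since client-side validation can always be bypassed. The sketch below assumes a deliberately strict rule (no "<" or ">" anywhere); the function names are made up for illustration.

```javascript
// Recursively collects every string value in a survey response.
function collectStrings(value, out = []) {
  if (typeof value === "string") {
    out.push(value);
  } else if (Array.isArray(value)) {
    value.forEach((v) => collectStrings(v, out));
  } else if (value && typeof value === "object") {
    Object.values(value).forEach((v) => collectStrings(v, out));
  }
  return out;
}

// Assumed rule: reject any answer containing "<" or ">".
function containsHtml(response) {
  return collectStrings(response).some((s) => /[<>]/.test(s));
}

console.log(containsHtml({ rating: 4, comment: "Great course" }));   // false
console.log(containsHtml({ comment: "<script>alert(1)</script>" })); // true
console.log(containsHtml({ topics: ["labs", "<b>lectures</b>"] }));  // true
```

A response that fails the check can be rejected with a 400 instead of being stored.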

And that's a wrap! Fire up the Express API server and the React app, and give it a try!

Power, Back In Your Hands.

If you want to collect data for your surveys in a legally compliant manner and protect the anonymity and privacy of your respondents, there are professional and ethical codes of conduct that *must* be taken into consideration, and several moving parts that must work in unison. These are all very different tasks, involving very different disciplines.

Hopefully, you now have an understanding of how using SurveyJS - being free, open-source, and under the very permissive MIT license - would enable you to do just that, and address these concerns of privacy, data ownership, customization, and vendor lock-in out of the box.

Just don't forget to *also* draft a comprehensive Privacy Policy!