
Sitecore Virtual Developer Day 2019: Data minimisation takeaways

by Sam Mullins, Web Developer, 7 March 2019


Last year I tuned in to the first Sitecore Virtual Developer Day and found it very informative. This year’s event took place on 7 March and was packed with more than a dozen talks, ranging from DevOps to content editor user experience, from commerce to containers, and from Salesforce to SiteCron.

Over the day I picked up numerous tips and tricks from Sitecore experts. Some were very simple, but would improve the sites we deliver for clients. For instance, I really liked Robert Senktas’ idea of colour-coding user-friendly errors and warnings in the Experience Editor to help the editor understand what action to take next. I also really enjoyed ‘Helix Smells’ by Martin Davies, which outlined some ways to spot when you’re doing something that isn’t quite right.

Beyond these, a couple of points really caught my attention. Much to my surprise, they were about GDPR and data persistence.

The first takeaway demonstrates how data regulation can force a developer to watch their step, while the second shows how data regulation can inspire developers to be more creative and considerate – and actually produce more effective software.

Any developer, even with good intentions, can accidentally break GDPR compliance.

In her presentation ‘Sitecore Privacy Guide and GDPR’, Martina Welander managed to make GDPR both approachable and terrifying at the same time.

The definition of personal data has broadened since the introduction of GDPR and, as a result, careful thought has to be put into every piece of data stored in Sitecore. This is typically the responsibility of a Data Protection Officer, so it is tempting for a developer to offload the burden of a €10 million fine to this person.

That cannot work, though: Sitecore contains many locations in which personal data could be stored, so developer involvement is required. Again, this responsibility could be assigned to one person, such as the lead developer on the project. However, Martina highlighted a crucial point: every single developer MUST consider GDPR for every line of code they write.

Do you store any personal information in logs?

Take the following example:

try
{
    _addressRepository.AddAddress(email, address);
}
catch
{
    // Problem: this message writes the user's address and email (personal data) to the log file.
    Sitecore.Diagnostics.Log.Error($"Failed to save address {address} for {email}", this);
}

This error logging may have been considered a good idea before GDPR, as it would have made it easier to track down the specific issue. Now, however, great care needs to be taken when logging information like this. If server security were compromised, the personal data in the logs would be easily accessible to the attacker. And if the user requests that their data be removed, the copy in the logs would not be deleted with it.
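One way to keep the log useful without the personal data is to log a random reference and the exception type instead. The following is a minimal sketch, not something shown in Martina’s talk: SafeLog and AddressSaveFailed are hypothetical names, and the exception message is deliberately left out on the assumption that it could echo the values that caused the failure.

```csharp
using System;

public static class SafeLog
{
    // Builds a log message that contains no personal data: only a random
    // reference and the exception type. ex.Message is omitted on purpose,
    // because it may repeat the address or email that failed to save.
    public static string AddressSaveFailed(Guid reference, Exception ex)
    {
        if (ex == null) throw new ArgumentNullException(nameof(ex));
        return $"Failed to save address (ref {reference:N}, {ex.GetType().Name})";
    }
}
```

In the catch block above, the resulting string could be passed to Sitecore.Diagnostics.Log.Error in place of the interpolated message, with Guid.NewGuid() as the reference. The same reference can be shown to the user so that a support request can be matched to the log entry without the log itself holding their details.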

Every developer needs to be aware of this, and if you build sites with Sitecore, Martina’s presentation is a good place to start. If your development is handled by an agency, it’s good to be aware of the risks too, so you can make sure your agency is compliant in dealing with personal data.

When it comes to the data you store, intelligently process the data before persisting.

As discussed above, GDPR has forced everyone to think about what needs to be stored; it is no longer acceptable to store absolutely everything you can about a user and decide how to use it at a later date. As Martina Welander said in her GDPR talk: “Big data used to be a big thing, now we’re really into data minimisation.”

If we want to minimise data persistence, defining a good strategy becomes very important. This strategy needs to dictate how to use technology to store the minimum amount of personal data that can help realise the company’s objectives.

In her talk ‘Where Machine Learning meets Social’, Una Verhoeven discussed creating a book recommendation engine based on the user’s Twitter profile.

Una argued that storing all of the user’s Twitter profile data was not particularly useful, as age, location, and hashtags do not accurately predict the books that may interest a person. Instead, Una passed information from Twitter into a machine learning service that could return predicted personality types from a Twitter timeline. This decouples the useful data from the source, and essentially reduces the risk of GDPR malpractice while also providing more useful data.

This more accurate data can be associated with outcomes, which can then be used to drive more accurate personalisation based on machine learning.
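The shape of that approach can be sketched as follows. This is an illustration under assumptions rather than Una’s implementation: VisitorTrait and TraitBuilder are hypothetical names, and the classify delegate stands in for the call to whichever machine learning service is used. The point is that the raw tweets are only passed through, and the derived personality type is the only thing returned for storage.

```csharp
using System;
using System.Collections.Generic;

// The only record intended for persistence: a derived trait, with no raw Twitter data.
public sealed class VisitorTrait
{
    public VisitorTrait(Guid visitorId, string personalityType)
    {
        VisitorId = visitorId;
        PersonalityType = personalityType;
    }

    public Guid VisitorId { get; }
    public string PersonalityType { get; }
}

public static class TraitBuilder
{
    // classify is a stand-in for the external machine learning service.
    // The tweets are used transiently for classification and are not kept.
    public static VisitorTrait Build(
        Guid visitorId,
        IReadOnlyList<string> tweets,
        Func<IReadOnlyList<string>, string> classify)
    {
        if (tweets == null) throw new ArgumentNullException(nameof(tweets));
        if (classify == null) throw new ArgumentNullException(nameof(classify));
        return new VisitorTrait(visitorId, classify(tweets));
    }
}
```

A derived trait linked to a visitor is likely still personal data, so this reduces what is stored rather than removing the need to think about GDPR.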

I thought this was a really good example that showed how to intelligently overcome problems that become apparent when you commit to data minimisation.

I always come away from Sitecore Virtual Developer Days with a fresh perspective on the opportunities and obstacles that we face. Often these lessons learned are in areas that would not be top of my normal Sitecore video playlist – and that’s what makes the day so valuable. I now look forward to SUGCON in a month’s time and all the insights that will bring.
