Blank screen is shown in Jira
Incident Report for Jira
Postmortem

Summary

From 12/Aug/24 10:39 UTC to 13/Aug/24 14:38 UTC, some customers experienced degraded performance on Issue View in Jira. This was caused by a data processing compatibility problem between a cache, the underlying database, and the new deployment.

Due to a slow increase in failure rate and a small initial surface area of impact, the problem didn’t immediately trigger our continuous error monitoring and alerting.

By the time we identified the issue, it was already resolving itself through self-healing mechanisms in the infrastructure. However, in a few outlier cases, we had to intervene with tenant-specific cache recalculations. All but 6 tenants were fully remediated by 12/Aug/24 21:30 UTC.

The issue occurred on the read layer of our architecture, so while customer experience was degraded, there was no data loss.

IMPACT

About 1% of instances were impacted over the lifetime of the incident. Users on those instances would have experienced degradation when loading Issue View in one specific scenario: when a Multi Select Custom Field was enabled on an Issue and that Custom Field also had a Default Value set.

ROOT CAUSE

We introduced a change in our code which caused processing of Custom Fields in specific configurations to fail. This prevented Issue View from loading issues for projects with the configuration described above.

This problem occurred because of different representations of the data in the database, in the code base, and in the cache in the production environment. These multiple representations caused an exception when translating the data from one representation to the next.
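
As an illustration only, the kind of translation failure described above can be guarded against with a decoder that accepts every known representation of a value. The report does not say what the actual representations were, so the two formats below (a comma-separated string and a list of strings), the function name decode_default_values, and the exception IncompatibleCacheEntry are all hypothetical. This is a minimal sketch under those assumptions, not the code involved in the incident.

```python
from typing import Any, List


class IncompatibleCacheEntry(Exception):
    """Raised when a stored value matches no known representation."""


def decode_default_values(raw: Any) -> List[str]:
    """Normalize a multi-select default value to a list of option strings.

    Assumed (hypothetical) representations:
      - None: no default value is set
      - "a,b": older comma-separated string form
      - ["a", "b"]: newer list form
    """
    if raw is None:
        return []
    if isinstance(raw, str):
        # Older form: split on commas and drop empty fragments.
        return [part.strip() for part in raw.split(",") if part.strip()]
    if isinstance(raw, (list, tuple)) and all(isinstance(item, str) for item in raw):
        # Newer form: already a sequence of option strings.
        return list(raw)
    # Unknown shape: fail with a specific error the caller can treat as a cache miss.
    raise IncompatibleCacheEntry(f"unrecognized representation: {type(raw).__name__}")
```

A caller on the read path could catch IncompatibleCacheEntry and fall back to reloading the value from the database, so that one unreadable entry does not prevent the whole view from loading.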

REMEDIAL ACTIONS PLAN & NEXT STEPS

The problem largely self-healed as the cache expired and was refreshed with compatible data; however, we chose to force cache re-computation for affected tenants in order to expedite this process.

We chose not to roll back the deployment at that point as that would have created the reverse compatibility issue with the already healed tenants.
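
One general way to avoid a compatibility problem in either direction (roll forward or roll back) is to tag each cache entry with the schema version of the code that wrote it and to treat a mismatched entry as a cache miss. The sketch below shows that idea together with a forced per-tenant re-computation. It assumes an in-memory dictionary as the store and keys of the form "tenant:field"; the names VersionedCache, Entry, and invalidate_tenant are hypothetical and are not taken from Jira's implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Entry:
    schema_version: int  # version of the code that wrote this entry
    value: Any


class VersionedCache:
    """Cache that ignores entries written under a different schema version."""

    def __init__(self, schema_version: int, store: Optional[Dict[str, Entry]] = None):
        self.schema_version = schema_version
        self.store: Dict[str, Entry] = {} if store is None else store

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        entry = self.store.get(key)
        if entry is not None and entry.schema_version == self.schema_version:
            return entry.value
        # Missing or written by another version: recompute and overwrite.
        value = compute()
        self.store[key] = Entry(self.schema_version, value)
        return value

    def invalidate_tenant(self, tenant_id: str) -> int:
        """Drop every entry for one tenant so the next read recomputes it."""
        prefix = f"{tenant_id}:"
        stale = [key for key in self.store if key.startswith(prefix)]
        for key in stale:
            del self.store[key]
        return len(stale)
```

With this design, a deployment and a rollback can share one store: each side recomputes entries it did not write instead of failing while translating them. The cost is extra recomputation whenever the version changes.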

Instead, we focused on fixing forward with a hotfix and accelerating remediation for the tenants still affected. For a very small number of tenants, forced re-computation did not immediately rectify the problem, and we had to roll forward a code hotfix to remediate.

We are prioritizing the following improvement actions to avoid repeating this type of incident:

  • The already deployed hotfix to stop this particular problem recurring.
  • A series of tests for this class of issue in our read layer.
  • A review of monitoring to detect these fine-grained problems before they cause more customer impact.
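
To illustrate why a fleet-wide error rate can miss a problem confined to about 1% of instances, the sketch below compares the overall failure rate with a per-tenant check. It is a minimal example under stated assumptions: the event shape (tenant id, success flag), the function names, and the threshold values are hypothetical and do not describe our actual monitoring.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, bool]  # (tenant_id, request_succeeded)


def overall_failure_rate(events: Iterable[Event]) -> float:
    """Fraction of all requests that failed, across every tenant."""
    events = list(events)
    if not events:
        return 0.0
    return sum(1 for _, ok in events if not ok) / len(events)


def tenants_over_threshold(
    events: Iterable[Event], threshold: float = 0.5, min_requests: int = 20
) -> List[str]:
    """Return tenants whose own failure rate is at or above the threshold."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for tenant_id, ok in events:
        totals[tenant_id] += 1
        if not ok:
            failures[tenant_id] += 1
    # min_requests avoids flagging tenants on a handful of requests.
    return sorted(
        tenant
        for tenant, count in totals.items()
        if count >= min_requests and failures[tenant] / count >= threshold
    )
```

For example, with 99 healthy tenants and one tenant failing every request at equal traffic, the overall failure rate is 1%, which a global alert threshold may never reach, while the per-tenant check flags the broken tenant directly.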

We apologize for any disruption this issue may have caused and are taking steps to help ensure it does not occur again.

Thanks,

Atlassian Customer Support

Posted Aug 15, 2024 - 12:16 UTC

Resolved
The fix for the incident has been deployed and the issue has been resolved.
Posted Aug 13, 2024 - 15:34 UTC
Update
The root cause has been identified, and the hotfix is currently being deployed. It is estimated that the process will be completed for all tenants within the next 2 hours.
Posted Aug 13, 2024 - 12:55 UTC
Identified
The root cause was identified and the hotfix is being deployed.
Posted Aug 13, 2024 - 07:07 UTC
Update
The issue was largely remediated. Team is working on fixing a few remaining tenants.
Posted Aug 13, 2024 - 00:49 UTC
Update
Team is investigating a potential root cause for the issue. Fix is yet to be rolled out.
Posted Aug 12, 2024 - 19:00 UTC
Update
Team is actively working on a fix to mitigate the issue. Root cause is still to be determined.
Posted Aug 12, 2024 - 18:15 UTC
Update
Team is still working on a potential fix to mitigate the issue while investigation into root cause continues.
Posted Aug 12, 2024 - 17:46 UTC
Update
The issue is still ongoing at the moment, across multiple products. The team is actively working on a fix.
Posted Aug 12, 2024 - 17:09 UTC
Investigating
We've identified an issue in Jira where, upon loading issues in multiple projects, a blank screen is shown.

The team is currently working on identifying the root cause and resolving it.
Posted Aug 12, 2024 - 15:51 UTC
This incident affected: Viewing content and Search.