Open-source News

AMDVLK 2023.Q1.2 Vulkan Driver Released With Fixes, New Extension

Phoronix - Mon, 02/20/2023 - 19:52
Out today is AMDVLK 2023.Q1.2 as AMD's second open-source Vulkan Linux driver release of the year...

EROFS Gets Low-Latency Decompression For Much Better Performance

Phoronix - Mon, 02/20/2023 - 19:39
The EROFS file-system updates for Linux 6.3 include introducing a new option for per-CPU KThreads to provide low-latency decompression for speeding up use of compressed EROFS file-systems on Android devices...

Qt Safe Renderer 2.0 Released To Enhance Functional Safety UIs

Phoronix - Mon, 02/20/2023 - 19:29
The Qt Group has released Qt Safe Renderer 2.0 as the newest version of their Qt renderer focused on functional safety for rendering user interface elements of utmost importance such as critical interfaces within automobiles and airplanes...

Linux 6.3 RAS/EDAC Changes Bring New Features For Intel & AMD

Phoronix - Mon, 02/20/2023 - 19:06
Among the early pull requests for the now-open Linux 6.3 merge window are the RAS (Reliability, Availability and Serviceability) and EDAC (Error Detection And Correction) updates...

4 questions open source engineers should ask to mitigate risk at scale

opensource.com - Mon, 02/20/2023 - 16:00
kathryn.xtang@… Mon, 02/20/2023 - 03:00

At Shopify, we use and maintain a lot of open source projects, and every year we prepare for Black Friday Cyber Monday (BFCM) and other high-traffic events to make sure our merchants can sell to their buyers. To do this, we built a large-scale infrastructure platform that is highly complex, interconnected, and globally distributed, and that requires thoughtful technology investments from a network of teams. We’re changing how the internet works, and at our scale no single person can oversee the full design and detail.

Over BFCM 2022, we served 75.98M requests per minute to our commerce platform at peak. That’s 1.27M requests per second. At this massive scale, in a complex and interdependent system, it is impossible to identify and mitigate every possible risk. This article breaks down a high-level risk mitigation process into four questions that can be applied to nearly any scenario to help you make the best use of your available time and resources.
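
The per-second figure follows directly from the per-minute peak. As a quick check, the conversion can be sketched in Python; the constant and function names here are illustrative and not taken from the original article.

```python
# Peak BFCM 2022 throughput quoted in the article, in requests per minute.
PEAK_REQUESTS_PER_MINUTE = 75.98e6


def per_second(requests_per_minute: float) -> float:
    """Convert a per-minute request rate into a per-second rate."""
    return requests_per_minute / 60


peak_rps = per_second(PEAK_REQUESTS_PER_MINUTE)
print(f"{peak_rps / 1e6:.2f}M requests per second")  # prints "1.27M requests per second"
```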

1. What are the risks?

To inform mitigation decisions, you must first understand the current state of affairs. We expand our breadth of knowledge by learning from people from all corners of the platform. We run “what could go wrong” (WCGW) exercises where anyone building or interested in infrastructure can highlight a risk. These can be technology risks, operational risks, or something else. Having this unfiltered list is a great way to get a broad understanding of what could happen.

The goal here is visibility.

2. What is worth mitigating?

Great brainstorming leaves us with a large and daunting list of risks. With limited time to fix things, the key is to prioritize what is most important to our business. To do this, we vote on risks, then gather technical experts to discuss the highest-ranked risks in more detail, including their likelihood and severity. We make decisions about what to mitigate and how, and which team will own each action item.

The goal here is to optimize how we spend our time.
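
One way to picture this step is a small Python sketch that shortlists risks by vote count and then ranks the shortlist by likelihood and severity. Everything specific here is an assumption made for illustration: the `Risk` fields, the 0 to 1 likelihood scale, the 1 to 5 severity scale, the `likelihood * severity` score, and the sample risks. The article does not say how Shopify scores or records risks.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    votes: int         # votes gathered during the WCGW exercise
    likelihood: float  # expert estimate on an assumed 0.0 to 1.0 scale
    severity: int      # expert estimate on an assumed 1 (minor) to 5 (critical) scale

    @property
    def score(self) -> float:
        # Assumed scoring rule: a simple likelihood-times-severity product.
        return self.likelihood * self.severity


def prioritize(risks, shortlist_size=3):
    """Shortlist the most-voted risks, then rank the shortlist by score."""
    shortlisted = sorted(risks, key=lambda r: r.votes, reverse=True)[:shortlist_size]
    return sorted(shortlisted, key=lambda r: r.score, reverse=True)


# Hypothetical risks, invented for this example.
risks = [
    Risk("database failover stalls", votes=12, likelihood=0.2, severity=5),
    Risk("cache stampede at peak", votes=9, likelihood=0.5, severity=3),
    Risk("expired TLS certificate", votes=7, likelihood=0.1, severity=4),
    Risk("noisy dashboard alerts", votes=2, likelihood=0.9, severity=1),
]

for risk in prioritize(risks):
    print(f"{risk.score:.1f}  {risk.name}")
```

Voting first and scoring second mirrors the order described above: the vote narrows a long list cheaply, and the more expensive expert discussion is spent only on the shortlist.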

3. Who makes what decisions?

In any organization, there are times when waiting for a perfect consensus is not possible or not effective. Shopify moves tremendously fast because we make sure to identify decision makers, then empower them to gather input, weigh risks and rewards, and come to a decision. Often the decision is best made by the subject matter expert or by whoever stands to gain or lose the most from the direction we choose.

The goal here is to align incentives and accountability.

4. How do you communicate?

We move fast but still need to keep stakeholders and close collaborators informed. We summarize key findings and risks from our WCGW exercises so that we all land on the same page about our risk profile. This may include key risks or single points of failure. We over-communicate so that we’re aligned and aware and stakeholders have opportunities to interject.

The goal here is alignment and awareness.

Solving the right things when there is uncertainty

Underlying all these questions is the uncertainty in our working environment. You never have all the facts or know exactly which components will fail when and how. The best way to deal with uncertainty is by using probability.

Expert poker players know that great bets don’t always yield great outcomes, and bad bets don’t always yield bad outcomes. What’s important is to bet on the probability of outcomes, because over enough rounds your results will converge to expectation. The same applies in engineering, where we constantly make bets and learn from them. Great bets require clearly distinguishing the quality of your decisions from the quality of your outcomes. It means not over-indexing on bad decisions that led to lucky outcomes or great decisions that happened to run into very unlucky scenarios.
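
The convergence claim can be illustrated with a short simulation. This is a minimal sketch under invented assumptions: a hypothetical bet that wins 1 unit with 60% probability and loses 1 unit otherwise, played for an increasing number of rounds. The numbers and the function name are illustrative only.

```python
import random


def average_payoff(win_probability, win_amount, loss_amount, rounds, seed=0):
    """Simulate repeated independent bets and return the mean payoff per round."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(rounds):
        if rng.random() < win_probability:
            total += win_amount
        else:
            total -= loss_amount
    return total / rounds


# Hypothetical bet: 60% chance to win 1 unit, 40% chance to lose 1 unit.
expected = 0.6 * 1.0 - 0.4 * 1.0  # about 0.2 units per round

for rounds in (10, 1_000, 100_000):
    observed = average_payoff(0.6, 1.0, 1.0, rounds)
    print(f"{rounds:>7} rounds: observed {observed:+.3f}, expected {expected:+.3f}")
```

Over a handful of rounds the observed average can land far from the expected value, which is the "good bet, bad outcome" case; with many rounds it settles close to it.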

Knowing that we can’t control everything also helps us stay calm, which is vital for us to practice good judgment in high-pressure situations.

When it comes to BFCM (and life in general), no one can predict the future or fully protect against all risks. The question is, what would you change looking back? In hindsight, would you feel confident that you prioritized the most important things and made thoughtful bets using the information available? Did you facilitate meaningful discussions with the right people? Could you justify your actions to your customers and their customers?

This article originally appeared on Planning in Bets: Risk Mitigation at Scale and is republished with permission.

What do you do with a finite amount of time to deal with an infinite number of things that can go wrong? 

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.
