opensource.com


Make a temporary file on Linux with Bash

Seth Kenlon | Mon, 06/27/2022

When programming in the Bash scripting language, you sometimes need to create a temporary file. For instance, you might need to have an intermediary file you can commit to disk so you can process it with another command. It's easy to create a file such as temp or anything ending in .tmp. However, those names are just as likely to be generated by some other process, so you could accidentally overwrite an existing temporary file. And besides that, you shouldn't have to expend mental effort coming up with names that seem unique. The mktemp command on Fedora-based systems and tempfile on Debian-based systems are specially designed to alleviate that burden by making it easy to create, use, and remove unique files.

Create a temporary file

Both mktemp and tempfile create a temporary file as their default action and print the name and location of the file as output:

$ tempfile
/tmp/fileR5dt6r

$ mktemp
/tmp/tmp.ojEfvMaJEp

Unless you specify a different path, the system places temporary files in the /tmp directory. For mktemp, use the -p option to specify a path:

$ mktemp -p ~/Demo
/home/tux/Demo/tmp.i8NuhzbEJN

For tempfile, use the --directory or -d option:

$ tempfile --directory ~/Demo/
/home/sek/Demo/fileIhg9aX

Find your temporary file

The problem with using an auto-generated temporary file is that you have no way of knowing what its name is going to be. That's why both commands return the generated file name as output. In an interactive terminal such as Konsole, GNOME Terminal, or rxvt, you can read the filename displayed on your screen and use it to interact with the file.

However, if you're writing a script, there's no way for you to intervene by reading the name of the file and using it in the following commands.

The authors of mktemp and tempfile thought of that problem, and there's an easy fix. A command sends its output to a stream called stdout. You can capture stdout by setting a variable to the result of a command launched in a subshell:

$ TMPFILE=$(mktemp -p ~/Demo)

$ echo $TMPFILE
/home/tux/Demo/tmp.PjP3g6lCq1

Use $TMPFILE when referring to the file, and it's the same as interacting directly with the file itself.
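In a script, that typically looks something like this. It's a minimal sketch, assuming GNU mktemp; the trap line is an extra precaution so the temporary file is removed when the script exits:

#!/usr/bin/env bash
# Create a unique temporary file and clean it up automatically on exit.
TMPFILE=$(mktemp) || exit 1
trap 'rm -f "$TMPFILE"' EXIT

# Use $TMPFILE like any other file.
printf '3\n1\n2\n' > "$TMPFILE"
sort "$TMPFILE"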

Create a temporary directory with mktemp

You can also use the mktemp command to create a directory instead of a file:

$ mktemp --directory -p ~/Demo/
/home/tux/Demo/tmp.68ukbuluqI

$ file /home/tux/Demo/tmp.68ukbuluqI
/home/tux/Demo/tmp.68ukbuluqI: directory

Customize temporary names

Sometimes you might want an element of predictability in even your pseudo-randomly generated filenames. You can customize the names of your temporary files with both commands.

With mktemp, you can add a suffix to your filename:

$ mktemp -p ~/Demo/ --suffix .mine
/home/tux/Demo/tmp.dufLYfwJLO.mine

With tempfile, you can set a prefix and a suffix:

$ tempfile --directory ~/Demo/ \
--prefix tt_ --suffix .mine
/home/tux/Demo/tt_0dfu5q.mine

Tempfile as touch

You can also set a custom name with tempfile:

$ tempfile --name not_random
not_random

When you use the --name option, it's absolute, ignoring all other forms of customization. In fact, it even ignores the --directory option:

$ tempfile --directory ~/Demo \
--prefix this_is_ --suffix .all \
--name not_random_at
not_random_at

In a way, tempfile can be a substitute for touch and test because it refuses to create a file that already exists:

$ tempfile --name example.txt
open: file exists

The tempfile command isn't installed on all Linux distributions by default, so you must ensure that it exists before you use it as a hack around test in a script.
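A minimal guard might look like this. It's only a sketch: command -v is the portable way to test for a command, and it assumes tempfile exits with a non-zero status when the file already exists, as the error above suggests:

# Only use the tempfile-as-test trick when the command actually exists.
if command -v tempfile >/dev/null 2>&1; then
    tempfile --name example.txt || echo "example.txt already exists"
else
    echo "tempfile not installed; falling back to test" >&2
    [ -e example.txt ] && echo "example.txt already exists"
fi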

Install mktemp and tempfile

GNU Core Utils includes the mktemp command. Major distributions include Core Utils by default (it's the same package that contains chmod, cut, du, and other essential commands).

The tempfile command is part of the Debian Utils package, which is installed by default on most Debian-based distributions and on Slackware Linux.

Wrap up

Temporary files are convenient because there's no confusion about whether they're safe to delete. They're temporary, meant to be used as needed and discarded without a second thought. Use them when you need them, and clear them out when you're done.


What is distributed consensus for site reliability engineering?

Robert Kimani | Mon, 06/27/2022

In my previous article, I discussed how to enforce best practices within your infrastructure. A site reliability engineer (SRE) is responsible for reliability, first and foremost, and enforcing policies that help keep things running is essential.

Distributed consensus

With microservices, containers, and cloud native architectures, almost every application today is going to be a distributed application. Distributed consensus is a core technology that powers distributed systems.

Distributed consensus is a protocol for building reliable distributed systems. You cannot rely on "heartbeats" (signals from your hardware or software to indicate that they're operating normally) because network failures are inevitable.

There are some inherent problems to highlight when it comes to distributed systems. Hardware will fail, and nodes in a distributed system can fail at random. This is one of the important assumptions you have to make before you design a distributed system. Network outages are also inevitable, so you cannot always guarantee 100% network connectivity. Finally, you need a consistent view of any node within a distributed system.

According to the CAP theorem, a distributed system cannot simultaneously have all of these three properties:

  1. Consistency: A consistent view of the data at each node. Without this guarantee, it's possible to see different data when reading from two different nodes in a distributed system.
  2. Availability: Refers to the availability of data at each node.
  3. Partition tolerance: Refers to tolerance of network failures (which result in network partitions).

A distributed system therefore has to choose which of these properties to trade off in order to function properly.

Over the years, several protocols have been developed in the area of distributed consensus, including Paxos, Raft, and Zab.

Paxos, for instance, was one of the original solutions to the distributed consensus problem. In the Paxos algorithm, nodes in a distributed system send a series of proposals with a unique sequence number. When the majority of processes in the distributed system accept the proposal, that proposal wins, and the sender generates a commit message. The key here is that the majority of the processes accept the proposal.

The strict sequence numbering of proposals is how Paxos avoids duplicating data and how it solves the problem of ordering.

Open source distributed consensus

You don't have to reinvent the wheel by writing your own distributed consensus code. There are many open source implementations already available, the most popular being ZooKeeper. Other implementations include Consul and etcd.
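For example, etcd exposes its consensus-backed key-value store through a simple command-line client. Here's a minimal sketch, assuming an etcd v3 server is reachable with default settings (the key and value are purely illustrative):

$ etcdctl put /services/api/leader web01
OK
$ etcdctl get /services/api/leader
/services/api/leader
web01

Every write like this is committed through the cluster's consensus protocol, so the members agree on the stored value before the client sees the OK.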

Designing autoscaling

Autoscaling is a process by which the number of servers in a server farm is automatically increased or decreased based on the load. The term "server farm" is used here to refer to any pool of servers in a distributed system. These servers are commonly behind a load balancer, as described in my previous article.

There are numerous benefits to autoscaling, but here are the 4 major ones:

  1. Reduce cost by running only the required servers. For instance, you can automatically remove servers from your pool when the load is relatively low.
  2. Flexibility to run less time-sensitive workloads during low traffic, which is another variation of automatically reducing the number of servers.
  3. Automatically replace unhealthy servers (most cloud vendors provide this functionality).
  4. Increase reliability and uptime of your services.

While there are numerous benefits, there are some inherent problems with autoscaling:

  1. A dependent back-end server or a service can get overwhelmed when you automatically expand your pool of servers. The service that you depend on, for example, the remote service your application connects to, may not be aware of the autoscaling activity of your service.
  2. Software bugs can trigger the autoscaler to expand the server farm abruptly. This is a dangerous situation that can happen in production systems. A configuration error, for instance, can cause the autoscaler to uncontrollably start new instances.
  3. Load balancing may not be intelligent enough to consider new servers. For example, a newly added server to the pool usually requires a warm up period before it can actually receive traffic from the load balancer. When the load balancer isn't fully aware of this situation, it can inundate the new server before it's ready.
Autoscaling best practices

Scaling down is more sensitive and dangerous than scaling up. You must fully test all scale-down scenarios.

Ensure the back-end systems, such as your database, remote web service, and so on, or any external systems that your applications depend on can handle the increased load. You may be automatically adding new servers to your pool to handle increased load, but the remote service that your application depends on may not be aware of this.

You must configure an upper limit on the number of servers. This is important. You don't want the autoscaler to uncontrollably start new instances.
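For example, on Kubernetes you can express both the lower and upper limits declaratively. This is a hedged sketch: kubectl autoscale is a standard subcommand, but the deployment name and thresholds are placeholders you would replace with your own:

$ kubectl autoscale deployment myapp --min=2 --max=10 --cpu-percent=75

The --max flag is the safety net: no matter what the load looks like, the autoscaler won't go beyond ten replicas.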

Have a "kill switch" you can use to easily stop the autoscaling process. If you hit a bug or configuration error that causes the autoscaler to behave erratically, you need a way to stop it.

3 systems that act in concert for successful autoscaling

There are three systems to consider for successful implementation of autoscaling:

  1. Load balancing: One of the crucial benefits of load balancing is the ability to minimize latency by routing traffic to the location closest to the user.
  2. Load shedding: You can't always process every incoming request, so process the ones you can and drop the excess traffic. Examples of load shedding systems are Netflix Zuul and Envoy.
  3. Autoscaling: Based on load, your infrastructure automatically scales up or down.

When you're designing your distributed applications, think through all the situations your applications might encounter. You should clearly document how load balancing, load shedding, and autoscaling work together to handle all situations.

Implementing effective health checks

The core job of load balancers is to direct traffic to a set of back-end servers. Load balancers need to know which servers are alive and healthy in order to successfully direct traffic to them. You can use health checks to determine which servers are healthy and can receive requests.

Here's what you need to learn about effective health checks:

  • Simple: Monitor for the availability of a back-end server.
  • Content verification: Send a small request to the back-end server and examine the response. For instance, you could look for a particular string or response code.
  • Failure: Your server may be up, but the application listening on a particular pod may be down. Or the pod may be listening, but it may not be accepting new connections. A health check must be intelligent enough to identify a problematic back-end server.

Health checks with sophisticated content verification can increase network traffic. Find the balance between a simple health check (a simple ping, for instance) and a sophisticated content-based health check.

In general, for a web application, hitting the home page of a web server and looking for a proper HTML response can serve as a reasonable health check. These kinds of checks can be automated using the curl command.
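For example, a minimal check might look like this. The hostname is a placeholder, and treating anything other than HTTP 200 as unhealthy is an assumption you'd tune for your own service:

# Succeed only when the home page answers with HTTP 200.
STATUS=$(curl --silent --output /dev/null --write-out '%{http_code}' http://web01.example.com/)
if [ "$STATUS" -eq 200 ]; then
    echo "web01 healthy"
else
    echo "web01 unhealthy (HTTP $STATUS)"
fi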

Whenever you are doing a postmortem analysis of an outage, review your health check policies and determine how fast your load balancer marked a server up or down. This analysis is very useful for tuning those policies.

Stay healthy

Keeping your infrastructure healthy takes time and attention, but done correctly it's an automated process that keeps your systems running smoothly. There's yet more to an SRE's job to discuss, but those are topics for my next article.


An open source project that opens the internet for all

Blake Bertuccelli | Sun, 06/26/2022

Accessibility is key to promoting an open society.

We learn online. We bank online. Political movements are won and lost online. Most importantly, the information we access online inspires us to make a better world. When we ignore accessibility requirements, people born without sight or who lost limbs in war are restricted from online information that others enjoy.

We must ensure that everyone has access to the open internet, and I am doing my part to work toward that goal by building Equalify.

What is Equalify?

Equalify is "the accessibility platform."

The platform allows users to run multiple accessibility scans on thousands of websites. With our latest version, users can also filter millions of alerts to create a dashboard of statistics that are meaningful to them.

The project is just getting started. Equalify aims to open source all the premium features that expensive services like SiteImprove provide. With better tools, we can ensure that the internet is more accessible and our society is more open.

How do we judge website accessibility?

The W3C's Web Accessibility Initiative publishes the Web Content Accessibility Guidelines (WCAG), which set standards for accessibility. Equalify and others, including the US federal government, use WCAG to measure website accessibility. The more websites we scan, the more we can understand the shortcomings and potential of WCAG standards.

How do I use Equalify?

Take a few minutes to browse our GitHub and learn more about the product. Specifically, the README provides steps on how to begin supporting and using Equalify.

Our goal

Our ultimate goal is to make the open internet more accessible. 96.8% of homepages do not meet WCAG guidelines, according to The WebAIM Million. As more people build and use Equalify, we combat inaccessible pages. Everyone deserves equal access to the open internet. Equalify is working toward that goal as we work toward building a stronger and more open society for all.


Image by:

"Plumeria (Frangipani)" by Bernard Spragg is marked with CC0 1.0.


Free RPG Day: Create maps for your Dungeons & Dragons game with Mipui

Seth Kenlon | Sat, 06/25/2022

It's Free RPG Day again, and there's no better way to play a free roleplaying game than with free and open source software. In this digital era of pen-and-paper gaming, it's still relatively unusual for adventures to include digital maps. In fact, it's also unusual for paper adventures to include maps that are sized correctly for miniatures, and many that do have colourful and richly textured maps that look great in a glossy book but look murky when photocopied and enlarged for the tabletop. Long story short: a tabletop gamer is often in need of a quick and convenient way to produce maps. Mipui is an open source web app that enables you to create grid-based maps for role-playing games, and it works great for virtual and physical tabletops alike.


No installation required

Mipui is primarily a hosted web app. If you use it and enjoy it, you can make a donation to help with operational costs at Mipui's ko-fi page.

Installation

Alternatively, you can install your own Mipui instance, because it's distributed with an MIT license. It's a Node.js application, and as long as you've got experience running a web server, it's a pretty simple installation process. I run mine on a computer I rescued from the tip for use as a home server. The exact package names may vary depending on your distribution, but here are the steps I use.

First, install Git, a web server, and Node.js. Then use Git to download Mipui to the web root directory. Web root directories vary depending on your web server defaults, so verify the path if you're unsure where your web server looks for files.

$ sudo dnf install git nginx nodejs
$ sudo git clone https://github.com/amishne/mipui.git \
/usr/share/nginx/html/

Next, install the required Node packages with npm:

$ npm install firebase eslint

Finally, start the web server and open a port in your firewall to permit HTTP traffic:

$ sudo firewall-cmd --add-service http
$ sudo systemctl enable --now nginx

Navigate to /mipui/public/app/index.html on your server's address. For instance, if your server's hostname is example.local then the address is example.local/mipui/public/app/index.html.

As of this writing, Mipui is rapidly gaining tools that turn it into a full-fledged interactive tabletop, complete with digital miniatures in the form of tokens, line-of-sight calculation, and fog of war. However, I currently use Mipui for cartography.

Drawing a room

The top toolbar of Mipui provides general categories of tasks. When you click on one, a toolbox appears below it with further options.


You can set the theme you want applied to your maps at any time. My favourite is Cross Hatch (with grid), but there are several other themes available within the Map toolbox.

To start drawing, click the Walls button. Select the Rectangle Room tool from the toolbox, hover over the grid space where you want the room to start, and then click and hold as you drag the rectangle to the desired size. You can divide the room into smaller sections with the same tool. Walls that overlap are merged into a continuous shape.

You can also add single walls with Wall (auto) and Wall (manual) tools.


You can add other features, such as doors, secret doors, windows, and bars using the Separators tool.

You can add arbitrary features, like furniture, crates, treasure chests, and so on, using the Shapes tool.

For stairs, ramps, underground passages, and pits, use the Elevation tool.


Open source game maps

There are lots more features in Mipui, including tools for the Game Master to conceal secret doors and text, and to obscure and reveal parts of the map. There's a library of tokens to serve as player characters, NPCs, and monsters. You can share a map in either read-only mode or editable mode, you can save a map to local storage (and import it again later), or export a map as an image for printing or for uploading to a virtual tabletop like Mythic Table.

I've been able to reimplement maps from published adventures quickly and easily with Mipui. The simple UI appeals to me, and the simple themes for the maps let the maps define the space but ensure that the players rely on imagination to fill the space in. Try Mipui the next time you need a map for your game.


How I sketchnote with open source tools

Amrita Sakthivel | Fri, 06/24/2022

Sketchnoting, also called visual notetaking, is a method of taking notes using illustrations, symbols, graphic layouts, and text. It's meant to be a creative and engaging way to record your thoughts. It can work well in your personal life as well as in your work life. You don't need to be an artist to create a sketchnote, but you do need to listen, and visually combine and summarize ideas through text and drawings.

Why sketchnotes?

Here are some interesting facts about why visual aids are so helpful:

  • The picture superiority effect is when people remember and retain pictures and images more than just plain old words.
  • Writing is complicated and takes a long time to get good at.
  • Visual information can be processed 60,000 times faster than text. This is one reason that iconography is so prevalent and has been throughout history.
  • Researchers state that people remember only 20% of what they read, whilst 37% are visual learners.
An example of how the mind processes visual aids compared to regular text

A SMART goal is used to help guide goal setting. SMART is an acronym that stands for Specific, Measurable, Achievable, Realistic, and Timely. Therefore, a SMART goal incorporates all of these criteria to help focus your efforts and increase the chances of achieving your goal.

Image by:

(Amrita Sakthivel, CC BY-SA 4.0)

Which form of information did you retain more easily? Was it the visual or the text version that held your attention more? What does this say about how you process information?

4 open source sketchnote tools

A sketchnote is just a drawing, so you don't need any special application to start making them yourself. However, if you're an avid digital creator, you might want to try these open source illustration applications:

  • Mypaint: Simple and elegant. It's you, your stylus, and a blank canvas.
  • Krita: You, your stylus, a blank canvas, and an art supply store.
  • Inkscape: Grab some clip art or create your own and let the layout begin.
  • Drawpile: Make collaborative sketchnotes.
How I use sketchnotes

I recently contributed to a presentation about customer support and Knowledge-centered support (KCS) analysis. I did two versions:

Image by:

(Amrita Sakthivel, CC BY-SA 4.0)

Image by:

(Amrita Sakthivel, CC BY-SA 4.0)

I created a sketchnote to demonstrate the differences between OpenShift and Kubernetes.

Image by:

(Amrita Sakthivel, CC BY-SA 4.0)

As a technical writer, my objective is to write documentation from a user perspective so that the user gets the optimal use of the product or feature.

Image by:

(Amrita Sakthivel, CC BY-SA 4.0)

How to best convey information through a visual medium
  1. Plan what you want to convey.
  2. Decide the structure you want to use for the sketchnote.
  3. Start with the text first, then add icons and visuals.
  4. Use color combinations to help show the content effectively.

Sometimes, using plain text to convey easy concepts can be inelegant in comparison to a simple visual aid. Visual aids are an easier and faster way to display information to your audience. They can also be helpful when taking your own notes. Give it a try.


Create a more diverse and equitable open source project with open standards

Paloma Oliveira | Fri, 06/24/2022

This article is intended to serve as a reference so that you can understand everything you need to be proud of your repository and make your open source project more open. By using open standards, an open source project improves its quality and shareability, since such standards exist to foster better communication between creators and consumers of the project. Most importantly, open standards can guide technology development by gently enforcing space for diversity and equity.

What is open source?

The idea behind open source took shape in the late 1980s with the free software movement, as a way to guarantee access to technological development by legally guaranteeing the right to copy, modify, and redistribute software. This idea has expanded, and today it is about fostering a culture of sharing that supports everything from political actions to a billion dollar technology industry.

The projects and their communities, which give the projects their value, have become much more complex than just the code. Today, it is impossible to think of a project outside of what I prefer to define as its ecosystem. "Ecosystem" sounds to me like a proper definition, because it acknowledges the complexity of technical things, like code and configuration, and also of people.

Lack of diversity is a problem in open source

Without open source, the technology industry would collapse, or it wouldn't even exist. That's the scope of importance that open source has today. What a powerful feeling it is to know that we are all "standing on the shoulders of giants"! We are all benefiting from the power of the commons, using collective labor and intelligence to make something better for everyone.

What's rarely spoken of is that such important initiatives, in most cases, depend solely on the volunteer labor of its maintainers. This creates a huge imbalance, both from work and diversity aspects.

Open source is intrinsically a power to foster diversity within the development industry by valuing what is contributed over who contributes it. The reality is, though, that free time is often a rare commodity for many people. Many people are too busy working to generate income, caring for families and loved ones, looking for work, or fighting social injustice, and are unable to dedicate time to contribute to software.

The very opportunity to contribute to the system depends on you being one of the lucky ones who can be part of this system. This is not a reality for many others because of their gender, skin color, or social status. Historically, women accumulate unpaid work that's invisible, but which requires a substantial proportion of their energy and time. Underprivileged people have little free time because they have to work more hours, often having more than one job.

This is reflected in the numbers. Only 4.5% of open source maintainers are not white males, according to research into the field. So we know that this billion dollar industry, shaping technological development, is composed of a homogeneous environment. But we also know that diversity renders robust, innovative results. The question is, how can this be changed?

Intentional communication with your open source community

Communication is key. Build a structure with transparency of communication and governance for your project. Clear, concise and respectful communication makes your project accessible to users and contributors. It helps project maintainers devote their time focusing on what they need to do. It helps interested people feel welcome and start contributing faster and more consistently, and it attracts diversity to your community.

Sounds great, but how can this be obtained? I grouped the rules of good practice into three categories: procedural, daily, and long term. These practices are in part strategic, but if you and your community don't have the capacity to be strategic, it's also possible to substantially change your project by adding a few simple files to your repository.

But which files are those, and what happens when you already have several projects under your management? A few of them are:

  • Code of conduct
  • License
  • Readme
  • Changelog
  • Contributing
  • Ownership
  • Test directory
  • Issues
  • Pull request templates
  • Security
  • Support

To help you get started, there are many projects that offer templates. By simply cloning them, you create a repository with these documents.
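Even without a dedicated tool, a quick shell loop gives you an overview of what a repository already has. This is a minimal sketch; the filenames reflect common conventions and may differ in your project:

# Run from the root of a repository to list which standard files exist.
for f in README.md LICENSE CODE_OF_CONDUCT.md CONTRIBUTING.md CHANGELOG.md SECURITY.md SUPPORT.md; do
    if [ -e "$f" ]; then
        echo "present: $f"
    else
        echo "missing: $f"
    fi
done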

Another tool, designed to help open source software (OSS) maintainers and open source program offices (OSPOs), is check-my-repo. Created by us at Sauce Labs' OSPO Community, it's an automated tool built on Repolinter that verifies whether the main parameters necessary to comply with open source best practices (including the files mentioned above and a few other rules) are present in your repositories. The web app also explains why each file needs to exist.

Procedural best practices

As the name implies, this is about the process:

  • Maintain a single public issue tracker.
  • Allow open access to the issues identified by the project.
  • Have mechanisms for feedback and to discuss new features.
  • Offer public meeting spaces scheduled in advance and have them recorded.

Here are some files that relate to the procedural logic:

  • README: Make it easier for anyone who lands on your project to get started.
  • Code of conduct: Establish expectations and facilitate a healthy and constructive community.
  • Ownership: Make sure that someone is put in charge of the project to prevent it from being forgotten.
Daily tasks

This is about the day-to-day aspects, including:

  • Check the status of the project.
  • Explain how to submit issues, propose enhancements, and add new features.
  • Show how to contribute to the project.

Files related to the daily aspects of project management are:

  • Contributing: A step by step guideline on how to contribute.
  • Changelog: Notable changes need to be logged.
  • Security: Show how to report a security vulnerability.
  • Support: How the project is being maintained.
Long term goals

This has information that guarantees the history and continuation of the project, such as a mission statement, key concepts and goals, a list of features and requirements, and a project roadmap.

Relevant files are:

  • License: It's essential for users to know their limits, and for you to protect yourself legally.
  • Test directory: Use this to avoid regression, breaks, and many other issues.
Creating and maintaining your open source project

Now imagine a project with all of these factors. Will it help you build and keep a community? Will noise in communication be mitigated? Will it save maintainers tons of time so they can onboard people and solve issues? Will people feel welcome?

Creating and maintaining an open source project is very rewarding. Creating collaboratively is an incredible experience and has the intrinsic potential to take such creation into possibilities that one person alone, or a small group could never achieve. But working openly and collaboratively is also a challenge for the maintainers and a responsibility for the community to ensure that this space is equitable and diverse.

There's a lot of work ahead. The results of surveys on the health of open source communities often reflect the worst of the technology industry. That's why ensuring that these standards are used is so important. To help mitigate this situation, I am betting on standards. They're a powerful tool to align our intentions and to guide us to an equitable, transparent, and shareable space. Will you join me?


Image by:

Monsterkoi. Modified by Opensource.com. CC BY-SA 4.0


Applying smart thinking to open organization principles

Ron McFarland | Thu, 06/23/2022

Habits have an unfair reputation of being bad things that need to be stopped. But consider this: Automation and habits have many things in common. First, they're both repetitive. They're done over and over, the exact same way. More importantly, they reduce costs (both cost per piece and cost of energy used).

Art Markman's book Smart Thinking stresses the importance of creating habits to take your mind away from unimportant thoughts and redirect your attention toward more interesting and creative ideas. You can apply this concept to the use of open organization principles.

This article is the second part of a two-part series on habits. The first part drew from a presentation I gave a few years ago. In this article, I apply new information from Markman.

What is smart thinking?

Smart thinking is about the information you have, how you get valuable information, and how you use what you have successfully. It's also the ability to solve new problems using your current knowledge. Smart thinking is a skill you can use, practice, and improve on. Based on that definition, those benefits would be very helpful in promoting open organization principles.

Forming habits

Markman says habits are formed, unthinkingly automatic, and rarely reviewed for their importance. A process of strategic thinking is required to do this review. If performed correctly, less effort is needed to change these habits. The human mental system is designed to avoid thinking when possible. It's preferable to just effortlessly act on routine tasks so you can direct your attention to more important issues. When you perform tasks repeatedly, you form habits so your mind doesn't have to think through them step by step.

That's why you can't remember things you did a few seconds ago. Markman gives the example of not remembering whether you locked the door of your apartment or house. Because you did it automatically, if you were thinking of something completely different when you locked the door, you likely don't know whether you locked it.

The goal of smart thinking is to develop as many good habits as possible, so your mind doesn't have to go through the effort of thinking every time you perform them. Developing checklists is a good way to avoid excessive thinking. This can be applied to open organization principles as well. To save on energy and effort, follow a list of items and act on them. This was my goal when I developed the Communication technology maturity worksheet, which I wrote about in my article "How to assess your organization's technical maturity." If there's no routine and the activity is new, you must resort to a more stressful and tiring thought process.

Importance of environment and repetition

Markman describes a formula for creating smart habits, considering two key facts:

  • There is interaction between environment and action.
  • Only work on repeated actions to create good habits.
Environment

Markman defines an environment as both the outside world and one's internal mental world (such as goals, feelings, or perspectives). In the outside world, you can develop seemingly meaningless activities to create triggers and reminders. For example, you assign convenient locations for things, so you don't have to think about where they are. You can create triggers (such as lists or empty packages for resupply reminders) to highlight things you need to do, thereby avoiding a lot of thought.

If you want to read a book but have been putting it off, consider taking it out and putting it in front of your computer or TV. That way, you'll see it, and probably have to move it out of your way, each time you're at your computer or TV, which can be a powerful way to get you reading.

If you learn something in a particular place (with all its surroundings), you will recall that learning more easily if you are in a similar place or the same place. The environment triggers the information to come to the top of your mind. Because new (wanted) habits must be consciously performed, you must add elements in your environment to remind yourself to do them. Also, you must eliminate triggers that remind you of the old (unwanted) habit and replace them with another activity.

For the internal mental environment, consider trying to memorize useful facts that are continually required, such as names, locations, the best road routes, and so on. If I meet someone I want to get to know better, I ask for their name and use it as much as possible, then write it down when I have a chance. Later, I add that name to my personal contact information. By repeatedly seeing the name, I eventually memorize it and use it without thinking.

Repetition

All habits are formed through repeated use, and the first time is the hardest. After many repetitions, minimal thought is required, until the habit becomes automatic. Structure your daily life by forming beneficial habits and practicing them. Regular role playing is equally important. By getting those memories top of mind, they become faster to retrieve.

If you can make each habit as distinct as possible, it will be easier to remember and retrieve when needed. You know you've created a good habit when you no longer have to think about it.

When good habits go bad

A habit can go from good to bad with a situational change. Getting rid of a bad habit and creating a productive new one takes a lot of thought. When a habit change is required, regularly getting a good night's sleep is essential. Your brain helps process the new experiences and behaviors and makes the activity more effective in the future.

There are two important characteristics of changing habits:

  1. Your current habit pushes you into action. You must overcome the pressure to act.
  2. When there's pressure to act, replace the habit with a more helpful action. New memories of the desired action must be developed and stored.

You must develop triggers to engage the new behavior. This trigger requires your attention, as you are forcing yourself to stop the strong current habit and develop a new and, initially, weak habit. If you are distracted, the old habit remains strong. Your effort could also be disrupted by stress, fatigue, or the powerful influence of your environment.

Willpower alone cannot change habits, and it's extremely exhausting to try. You might have some short-term successes, but you will wear out eventually. You need environmental support. Cravings happen when you have an active goal that has been blocked. They are your mind's way of saying you must strengthen your desired goal further. You must not only stop the old habit but study what in your environment is activating it.

To apply these concepts to open organization principles, consider how you can create an environment that makes it easy to activate a principle while discouraging the current unproductive behavior. You must create an environment that reduces the chance of the bad habit being recalled and create triggers to activate the principles. That's the goal of the open organization leadership assessments. They become triggers.

The Role of 3

One of Markman's key concepts is the Role of 3, which guides both effective learning and effective presentation of information in a way that will be remembered. Markman is talking about getting someone else to change his behavior. This reminds me of my sales seminar in which I wanted to improve (change) the sales associates' selling activities. There is a difference between education, which is just providing knowledge, and training. Training is developing a skill one can use and improve on. Therefore, I had to cut three-quarters of the content and repeat it four times in different ways in my seminars, including role playing. I think what Markman is saying is that stressing just three actions is ideal initially. In my sales training case, my greatest success was role playing product presentations. Not only did the sales people learn the features of the product, but they could talk about them confidently to customers. Afterward, that led to other skills that could be developed.

This Role of 3 applies to you too when wanting to act on something. Imagine a critical business meeting with vital information presented. Suppose you were distracted by jokes or anxiety-provoking rumors (about layoffs, demotions, or inadequate performance). The meeting could be a waste of your time unless you manage your attention on specific issues.

Here is what Markman recommends before, during, and after an important event, to learn three key points: 

  1. Prepare: Consider what you want to get out of the event (be it a meeting, presentation, or one-on-one discussion) and what you want to achieve. Consider at most three things. Information connected to your goal is easier to detect and remember. Your mental preparation directs your mind to the information you seek. Psychologists call this advance organizer activity. Consider what will be discussed and review any available materials beforehand.
     
  2. Pay attention: This is hard work, particularly when there are distractions. Avoid multitasking on unrelated work (including emails and instant messaging). Working memory is an important concept: Your attention impacts what you will remember later.
     
  3. Review: After any critical information-gathering event (a class, meeting, reading a book or article), write down three key points. If you can't write it down, review it in your mind, or record it on your phone or digital recorder. Try to attach what you learned to what you know. Where do they match or conflict?

Suppose you're presenting open organization principles to a group. To make sure they remember those principles and use them, follow these procedures:

  1. Start presentations with an agenda and outline for all to review.
  2. During the presentation, stay focused primarily on your three critical points. Present all principles, but continually review and highlight three of them.
  3. Help people connect the content to what they are doing now and how the principles will improve their activities.
  4. Share additional ways the principles can be applied with example use cases and their benefits.
  5. End all presentations with a three-point summary. An action plan can be helpful as well.

Say you're presenting the five open organization principles of transparency, collaboration, inclusivity, adaptability, and community. First, you present all of them in basic terms, but the Role of 3 tells you the audience will probably only pick up on three of them, so you focus on transparency, collaboration, and inclusivity.

Here's an example outline, following the Role of 3:

  1. Present three key ideas.
    1. Transparency
    2. Collaboration
    3. Inclusivity
  2. Organize the presentation by describing each key idea in one sentence.
    1. Transparency is getting as much useful information to other people as possible.
    2. Collaboration is discussing issues with as many other people as appropriate and avoiding making final decisions on your own.
    3. Inclusivity is getting as many different kinds of people involved in a project as appropriate.
  3. Develop key idea relationships.
    1. How is transparency related to what people already know? The more transparent you are, the easier it is to collaborate.
    2. How is collaboration related to what people already know, including transparency? If people are doing well with transparency, mention many times that by collaborating more, their transparency will improve.
    3. How is inclusivity related to what people already know, including transparency and collaboration? The more inclusive we are, the more diverse the ideas generated, leading to more transparency and collaboration.
  4. Provide three questions to help people remember and explain the ideas.
    1. When was the last time you paused before deciding something and decided instead to collaborate with all people concerned?
    2. Who do you think would be an easy person to collaborate with?
    3. Each week, how many people do you collaborate with? Should that number be improved upon?
Better for everyone

Now you know what smart thinking is all about. You've learned the Role of 3, and you understand the importance of environment and repetition. These concepts can help you be more productive and happier in whatever you get involved in, including bringing the open organization principles to your own communities.


How to use a circuit breaker pattern for site reliability engineering

Robert Kimani | Thu, 06/23/2022

Distributed systems have unique requirements for the site reliability engineer (SRE). One of the best ways to maintain site reliability is to enforce a certain set of best practices. These act as guidelines for configuring infrastructure and policies.

What is a circuit breaker pattern?

A circuit breaker pattern saves your service from halting or crashing when another service is not responding.

Image by:

(Robert Kimani, CC BY-SA 4.0)

For instance, in the example image, the business logic microservices talk to the User Auth service, Account service, and Gateway service.

As you can see, the "engine" of this business application is the business logic microservice. There's a pitfall, though. Suppose the account service breaks. Maybe it runs out of memory, or maybe a back end database goes down. When this happens, the calling microservice starts to have back pressure until it eventually breaks. This results in poor performance of your web services, which causes other components to break. A circuit breaker pattern must be implemented to rescue your distributed systems from this issue.

Implementing a circuit breaker pattern

When a remote service fails, the ideal scenario is for it to fail fast. Instead of bringing down the entire application, run your application with reduced functionality when it loses contact with something it relies upon. For example, keep your application online, but sacrifice the availability of its account service. Making a design like a circuit breaker helps avoid cascading failures.

Here's the recipe for a circuit breaker design:

  1. Track the number of failures encountered while calling a remote service.

  2. Fail (open the circuit) when a pre-defined count or timeout is reached.

  3. Wait for a pre-defined time again and retry connecting to the remote service.

  4. Upon successful connection, close the circuit (meaning you can re-establish connectivity to the remote service.) If the service keeps failing, however, restart the counter.

Instead of hanging on to a poorly performing remote service, you fail fast so you can move on with other operations of your application.
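As a rough illustration, here is the counting logic from steps 1 and 2 as a shell sketch. The URL, timeout, and threshold are assumptions, and a complete implementation would also handle steps 3 and 4 by resetting CIRCUIT_OPEN after a pre-defined wait:

FAILURES=0
THRESHOLD=3
CIRCUIT_OPEN=0

call_account_service() {
    if [ "$CIRCUIT_OPEN" -eq 1 ]; then
        echo "circuit open: failing fast" >&2
        return 1
    fi
    if curl --silent --fail --max-time 2 http://account-service.example.com/ > /dev/null; then
        FAILURES=0    # a success resets the failure counter
        return 0
    fi
    FAILURES=$((FAILURES + 1))
    if [ "$FAILURES" -ge "$THRESHOLD" ]; then
        CIRCUIT_OPEN=1    # open the circuit: stop calling the failing service
    fi
    return 1
}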

Open source circuit breaker pattern

There are very specific components provided by open source to enable circuit breaker logic in your infrastructure. First, use a proxy layer, such as Istio. Istio is a technology-independent solution that uses a "black box" approach (meaning that it sits outside your application).

Alternately, you can use Hystrix, an open source library from Netflix that's widely and successfully used by many application teams. Hystrix gets built into your code. It's generally accepted that Hystrix can provide more functionality and features than Istio, but it must be "hard coded" into your application. Whether that matters to your application depends on how you manage your code base and what features you actually require.

Implementing effective load balancing

Load balancing is the process of distributing a set of tasks over a set of resources (computing units), with the aim of making their overall processing as efficient as possible. Load balancing can optimize response time and avoid unevenly overloading some compute nodes while other nodes are left idle.

Image by:

(Robert Kimani, CC BY-SA 4.0)

Users on the left side talk to a virtual IP (vIP) on a load balancer. The load balancer distributes tasks across a pool of systems. Users are not aware of the back-end system, and only interact with the system through the load balancer.

Load balancing helps to ensure high availability (HA) and good performance. High availability means that failures within the pool of servers can be tolerated, such that your services remain available a high percentage of the time. By adding more servers to your pool, effective performance is increased by horizontally spreading the load (this is called horizontal scaling.)

3 ways to do load balancing
  1. DNS load balancing: This is the most straightforward method, and it's free. You don't have to purchase any specialized hardware because it's basic functionality of the DNS protocol: you simply return multiple A-records. An A-record maps a hostname to an IP address. When the DNS server returns multiple IP addresses for a single hostname, the client chooses one IP address at random; in other words, the load balancing is done for you automatically by the client.
  2. Dedicated hardware-based load balancer: This is what is typically used within a data center. It's feature-rich and highly performant.
  3. Software load balancer: You don't need dedicated hardware for this type of load balancing. You can simply install the load balancer software on commodity hardware and use it for your load balancing use cases.

Nginx and HAProxy are mostly used as software load balancers and can be very useful in development environments where you can quickly spin up their functionality.

DNS load balancing

A DNS server may return multiple IP addresses when a client looks up a hostname. This is because in a large-scale deployment, a network target isn't just running on one server or even just on one cluster. Generally, the client software chooses an IP address at random from the addresses returned. It's not easy to control which IP a client chooses.

Unless you have custom software built to determine the health of a server, pure DNS load balancing doesn't know whether a server is up or down, so a client could be sent to a server that is down.

Clients also cache DNS entries by design, and it's not always easy to clear them. Many vendors offer the ability to choose which datacenter a client gets routed to based on geographic location. Ultimately, an important way to influence how clients reach your services is through load balancing.
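You can see what a client receives from DNS-based load balancing with the dig command. The hostname and addresses below are illustrative (192.0.2.0/24 is a reserved documentation range):

$ dig +short www.example.com A
192.0.2.10
192.0.2.11
192.0.2.12

Each client picks one of the returned addresses, typically at random, which is the client-side balancing behavior described above.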

Dedicated load balancers

Load balancing can happen at Layer 3 (network), Layer 4 (transport), or Layer 7 (application) of the OSI (Open Systems Interconnection) model.

  • An L3 load balancer operates on the source and destination IP addresses.
  • An L4 load balancer works with IP and port numbers (using the TCP protocol).
  • An L7 load balancer operates at the final layer, the application layer, by making use of the entire payload. It makes use of the entire application data, including the HTTP URL, cookies, headers, and so on. It's able to provide rich functionality for routing. For instance, you can send a request to a particular service based on the type of data that's coming in. For example, when a video file is requested, an L7 load balancer can send that request to a streaming appliance.

There are 5 common load balancing methods in use today:

  • Round Robin: This is the simplest and most common. The targets in a load balancing pool are rotated in a loop.
  • Weighted Round Robin: Like a round robin, but with manually-assigned importance for the back-end targets.
  • Least Loaded or Least Connections: Requests are routed based on the load of the targets reported by the target back-end.
  • Least Loaded with Slow Start or Least Response Time: Targets are chosen based on their load, and requests are gradually increased over time to prevent flooding the least loaded back-end.
  • Utilization Limit: Based on queries per second (QPS), or events per second (EPS), reported by the back-end.

There are software-based load balancers, which can be software running on commodity hardware. You can pick up any Linux box and install HAProxy or Nginx on it. Software load balancers tend to evolve quickly, as software often does, and are relatively inexpensive.

You can also use hardware-based, purpose-built load balancer appliances. These can be expensive, but they may offer unique features for specialized markets.

Transitioning to Canary-based Deployment

In my previous article, I explained how Canary-based deployments help the SRE ensure smooth upgrades and transitions. The exact method you use for a Canary-based rollout depends on your requirements and environment.

Generally, you release changes on one server first, just to see how the deployed package fares. What you are looking for at this point is to ensure that there are no catastrophic failures, for example, a server doesn't come up or the server crashes after it comes up.

Once that one-server deployment is successful, you want to proceed to install it on up to 1% of the servers. Again, this 1% is a generic guideline; it really depends on the number of servers you have and your environment. You can then proceed to release to early adopters if that's applicable in your environment.

Finally, once canary testing is complete, you'd want to release for all users at that point.
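As a rough sketch, that progression might look like this in a shell script. The hostnames, deploy command, and health endpoint are placeholders, not a prescribed workflow:

# Deploy to a single canary host, verify it, then widen the rollout.
CANARY=web01.example.com
deploy_to() { ssh "$1" 'sudo systemctl restart myapp'; }    # placeholder deploy step

deploy_to "$CANARY" || exit 1
curl --silent --fail "http://$CANARY/healthz" > /dev/null || { echo "canary failed"; exit 1; }

for host in web02.example.com web03.example.com; do
    deploy_to "$host"
done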

Keep in mind that a failed canary is a serious issue. Analyze it, and implement robust testing processes so that issues like this are caught during the testing phase and not during canary testing.

Reliability matters

Reliability is key, so utilize a circuit breaker pattern to fail fast in a distributed system. You can use Hystrix libraries or Istio.

Design load balancing with a mix of DNS and dedicated load balancers, especially if your application is web-scale. DNS load balancing may not be fully reliable, primarily because the DNS servers may not know which back-end servers are healthy or even up and running.

Finally, you must use canary releases, but remember that a canary is not a replacement for testing.

Learn what a circuit breaker pattern is, how to implement it, and how to do load balancing.


Manage your Rust toolchain using rustup

Wed, 06/22/2022 - 15:00
Manage your Rust toolchain using rustup Gaurav Kamathe Wed, 06/22/2022 - 03:00

The Rust programming language is becoming increasingly popular these days, used and loved by hobbyists and corporations alike. One of the reasons for its popularity is the amazing tooling that Rust provides, making it a joy for developers to use. Rustup is the official tool for managing Rust tooling. Not only can it install Rust and keep it updated, it also lets you seamlessly switch between the stable, beta, and nightly Rust compilers and tooling. This article introduces rustup and some common commands to use.

Default Rust installation method

If you want to install Rust on Linux, you can use your package manager. On Fedora or CentOS Stream you can use this, for example:

$ sudo dnf install rust cargo

This provides a stable version of the Rust toolchain, and it works great if you are a beginner to Rust and want to try compiling and running simple programs. However, because Rust is a new programming language, it changes fast, and a lot of new features are frequently added. These features land first in the nightly version of the toolchain and later in the beta version. To try them out, you need to install these newer versions of the toolchain without affecting the stable version on the system. Unfortunately, your distro's package manager can't help you here.

Installing Rust toolchain using rustup

To get around the above issues, you can download an install script:

$ curl --proto '=https' --tlsv1.2 \
-sSf https://sh.rustup.rs > sh.rustup.rs

Inspect it, and then run it. It doesn't require root privileges and installs Rust according to your local user privileges:

$ file sh.rustup.rs
sh.rustup.rs: POSIX shell script, ASCII text executable
$ less sh.rustup.rs
$ bash sh.rustup.rs

Select option 1 when prompted:

1) Proceed with installation (default)
2) Customize installation
3) Cancel installation
> 1

After installation, you must source the environment variables to ensure that the rustup command is immediately available for you to use:

$ source $HOME/.cargo/env

Verify that the Rust compiler (rustc) and Rust package manager (cargo) are installed:

$ rustc --version
$ cargo --version

See installed and active toolchains

You can view the different toolchains that were installed and which one is the active one using the following command:

$ rustup show
Switch between toolchains

You can view the default toolchain and change it as required. If you're currently on a stable toolchain and wish to try out a newly introduced feature that is available in the nightly version, you can easily switch to the nightly toolchain:

$ rustup default
$ rustup default nightly

To see the exact path of the compiler and package manager of Rust:

$ rustup which rustc
$ rustup which cargo
Checking and updating the toolchain

To check whether a new Rust toolchain is available:

$ rustup check

Suppose a new version of Rust is released with some interesting features, and you want to get the latest version of Rust. You can do that with the update subcommand:

$ rustup update
Help and documentation

The above commands are more than sufficient for day-to-day use. Nonetheless, rustup has a variety of commands and you can refer to the help section for additional details:

$ rustup --help

Rustup has an entire book on GitHub that you can use as a reference. All the Rust documentation is installed on your local system, which does not require you to be connected to the Internet. You can access the local documentation which includes the book, standard library, and so on:

$ rustup doc
$ rustup doc --book
$ rustup doc --std
$ rustup doc --cargo

Rust is an exciting language under active development. If you’re interested in where programming is headed, keep up with Rust!

Rustup can be used to install Rust and keep it updated. It also allows you to seamlessly switch between the stable, beta, and nightly Rust compilers and tooling.


A site reliability engineer's guide to change management

Wed, 06/22/2022 - 15:00
A site reliability engineer's guide to change management Robert Kimani Wed, 06/22/2022 - 03:00

In my previous article, I wrote about incident management (IM), an important component of site reliability engineering. In this article, I focus on change management (CM). Why is there a need to manage change? Why not simply have a free-for-all where anyone can make a change at any time?

There are three tenets of effective CM, and they give you a framework for your CM strategy:

  • Rolling out changes progressively: There's a difference between progressive rollouts, in which you deploy changes in stages, and doing it all at once. Even though progressive change may look good on paper, there are pitfalls to avoid.
  • Detecting problems with changes: Monitoring is critical for your CM to work. I discuss and look at examples of how to set up effective monitoring to ensure that you can detect problems and make changes as quickly as possible.
  • Rollback procedures: How can you effectively roll back when things go wrong?
Why manage change?

It's estimated that 75% of production outages are due to changes: scheduled and approved changes that we all perform. This number is staggering, and it should be enough to make you get on top of CM to ensure that everything is in order before a change is attempted. The primary reason for these staggering numbers is that there are inherent problems with changes.

Infrastructure and platforms are rapidly evolving. Not so long ago, infrastructure was not as complex, and it was easy to manage. For example, an organization might have had a few servers running an application server, web servers, and database servers. But today, infrastructure and platforms are as complex as ever.

With the numerous sub-systems involved, it is impossible to analyze every interconnection and dependency after the fact. For instance, an application owner may not even know about a dependency on an external service until it actually breaks. Even if the application team is aware of the dependency, they may not know all the intricacies and all the different ways the remote service will respond to their change.

You cannot possibly test for unknown scenarios. This goes back to the complexity of current infrastructure and platforms. It would be cost-prohibitive, in terms of the time spent, to test each and every scenario before you actually apply a change. Whenever you make a change in your existing production environment, whether it's a configuration change or a code change, the truth is that you are at high risk of creating an outage. So how do you handle this problem? Let's take a look at the three tenets of an effective CM system.

3 tenets of an effective change management system for SREs

Automation is the foundational aspect of effective CM. Automation flows across the entire process of CM. This involves a few things:

  • Progressive rollouts: Instead of making one big change, a progressive rollout mechanism lets you implement change in stages, reducing the impact on the user base if something goes wrong. This is critical if your user base is large, as it is for web-scale companies.
  • Monitoring: You need to quickly and accurately detect any issue with changes. Your monitoring system should be able to reveal the current state of your application and service without any considerable lag.
  • Safe rollback: The CM system should roll back quickly and safely when needed. Do not attempt any change in your environment without having a bulletproof rollback plan.
Role of automation

Many of you are aware of the concept of automation, yet a lot of organizations lack it. To increase the velocity of releases, which is an important part of running an Agile organization, manual operations must be eliminated. This can be accomplished with Continuous Integration and Continuous Delivery, but those are only effective when most operations are fully automated. Automation naturally eliminates human errors due to fatigue and carelessness. Likewise, auto-scaling, an important function of cloud-based applications, requires no manual intervention and needs to be completely automated.

Progressive rollouts for SREs: deploying changes progressively

Changes to configuration files and binaries have serious consequences. In other words, when you make a change to an existing production system, you are at serious risk of impacting the end-user experience.

For this reason, when you deploy changes progressively instead of all at once, you reduce the impact when things go wrong. If you need to roll back, the effort is generally smaller when the changes are done progressively. The idea is to start your change with a smaller set of clients. If you find an issue with the change, you can roll it back immediately because the size of the impact is small at that point.

There is an exception to progressive rollout: you can roll out a change globally all at once if it is an emergency fix and doing so is warranted.

Pitfalls of progressive rollouts

Rollout and rollback can get complex because you are dealing with multiple stages of a release. A lack of traffic can undermine the effectiveness of a release, especially if the initial stages target a smaller set of clients. The danger is that you may prematurely sign off on a release based on too small a sample. Progressive rollout also requires a release pipeline in which you run one script across multiple stages.

Releases can take much longer compared to one single (big) change. In a truly web-scale application that is scattered across the globe, a change can take several days to fully roll out, which can be a problem in some instances.

Documentation is important, especially when a stage takes a long time and multiple teams are involved in managing the change. Everything must be documented in detail in case a rollback or a roll forward is warranted.

Due to these pitfalls, it is advisable to take a deeper look into your organization's change rollout strategy. While progressive rollout is efficient and recommended, if your application is small enough and does not require frequent changes, an all-at-once change is the way to go. By doing it all at once, you have a clean way to roll back if there is a need to do so.

High-level overview of progressive rollout

Once the code is committed and merged, we start a "canary release," where canaries are the test subjects. Keep in mind that they are not a replacement for complete automated testing. The name "canary" comes from the early days of mining, when a canary bird was used to detect whether a mine contained poisonous gas before humans entered.

After the test, a small set of clients is used to roll out our changes and see how things go. Once the "canaries" are signed off, move to the next stage, the "early adopters release." This is a slightly bigger set of clients you use for the rollout. Finally, if the "early adopters" are signed off, move to the biggest pack of the bunch: all users.

Image by:

(Robert Kimani, CC BY-SA 4.0)

"Blast radius" refers to the size of the impact if something goes wrong. It is the smallest when we do the canary rollout and actually the biggest when we rollout to all users.

Options for progressive rollouts

How a progressive rollout is structured depends on the application and the organization. For global applications, a geography-based method is an option. For instance, you can choose to release to the Americas first, followed by Europe and regions of Asia. When your rollout is organized around departments within an organization, you can use the classic progressive rollout model adopted by many web-scale companies. For instance, you could start off with "canaries," then HR and Marketing, and then customers.

It's common to choose internal departments as the first clients for progressive rollouts, and then gradually move on to the external users.

You can also choose a size-based progressive rollout. Suppose you have one thousand servers running your application. You could start with 10% of them, then increase the rollout to 25%, 50%, 75%, and finally 100%. In this way, you affect only a small set of servers as you advance through your progressive rollout.
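As a rough sketch of that size-based approach, here's what the staging logic might look like in Python (deploy_to and fleet_is_healthy are placeholders for your own deployment and health-check tooling, not a real API):

import math

STAGES = [0.10, 0.25, 0.50, 0.75, 1.00]      # fraction of the fleet per stage

def progressive_rollout(servers, deploy_to, fleet_is_healthy):
    """Deploy to larger and larger slices of the fleet, halting on trouble."""
    done = 0
    for fraction in STAGES:
        target = math.ceil(len(servers) * fraction)
        for server in servers[done:target]:  # only the servers added in this stage
            deploy_to(server)
        done = target
        if not fleet_is_healthy(servers[:done]):
            raise RuntimeError(f"rollout halted at {int(fraction * 100)}%: roll back")
    return done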

There are periods when an application must run two different versions simultaneously. This is something you cannot avoid in a progressive rollout.

Binary and configuration packages

There are three major components of a system: the binary (the software), data (for instance, a database), and configuration (the parameters that govern the behavior of the application).

It's considered best practice to keep binary and configuration files separate from one another, and to use version-controlled configuration. Your configurations must be "hermetic": whenever the application derives its configuration, the result is the same regardless of when and where it is derived. This is achieved by treating configuration as code.
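As a small illustration of what hermetic, version-controlled configuration can look like in practice (the parameter names here are invented), the configuration below lives next to the code, is reviewed like code, and produces the same fingerprint no matter where or when it is loaded:

import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ServiceConfig:
    # Checked into version control and reviewed like any other code change.
    max_connections: int = 100
    request_timeout_s: float = 2.5
    enable_fast_path: bool = False

def config_fingerprint(cfg: ServiceConfig) -> str:
    """A stable hash lets any two hosts prove they derived identical config."""
    canonical = json.dumps(asdict(cfg), sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:12]

print(config_fingerprint(ServiceConfig()))   # same output on every host, every time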

Monitoring for SREs

Monitoring is a foundational capability of an SRE organization. You need to know if something is wrong with your application that affects the end-user experience, and your monitoring should help you identify the root cause.

The primary functions of monitoring are:

  • Provides visibility into service health.
  • Allows you to create alerts based on a custom threshold.
  • Helps you analyze trends and plan capacity.
  • Provides detailed insight into various subsystems that make up your application or service.
  • Provides code-level metrics to understand behavior.
  • Makes use of visualization and reports.
Data Sources for Monitoring

You can monitor several aspects of your environment. These include:

  • Raw logs: Generally unstructured data generated by your application, servers, or network devices.
  • Structured event logs: Easy-to-consume information, for example Windows Event Viewer logs.
  • Metrics: A numeric measurement of a component.
  • Distributed tracing: Trace events are generally created automatically by frameworks such as OpenTelemetry, or manually using your own code.
  • Event introspection: Helps to examine properties at runtime at a detailed level.

When choosing a monitoring tool for your SRE organization, you must consider what's most important.

Speed

How fast can you retrieve and send data into the monitoring system?

  • How fresh should the data be? The fresher the data, the better. You don't want to be looking at data that's two hours old; you want the data to be as close to real time as possible.
  • Ingesting and alerting on real-time data can be expensive. You may have to invest in a platform like Splunk, InfluxDB, or Elasticsearch to fully implement it.
  • Consider your service-level objective (SLO) to determine how fast the monitoring system should be. For instance, if your SLO is two hours, you do not have to invest in systems that process machine data in real time.
  • Querying vast amounts of data can be inefficient. You may have to invest in enterprise platforms if you need very fast retrieval of data.
Resolution check

What is the granularity of the monitoring data?

  • Do you really need to record data every second? The recommended way is to use aggregation wherever possible.
  • Use sampling if it makes sense for your data.
  • Metrics are suited for high-resolution monitoring instead of raw log files.
Alerting

What alert capabilities can the monitoring tool provide?

Ensure the monitoring system can be integrated with other event processing tools or third party tools. For instance, can your monitoring system page someone in case of emergency? Can your monitoring system integrate with a ticketing system?

You should also classify alerts with different severity levels. You may want to choose a severity level of three for a slow application versus a severity level of one for an application that is not available. Make sure alerts can be easily suppressed to avoid alert flooding. Email or page flooding can be very distracting to the on-call experience, so there must be an efficient way to suppress alerts.

[ Read next: 7 top Site Reliability Engineer (SRE) job interview questions ]

User interface check

How versatile is it?

  • Does your monitoring tool provide feature-rich visualization tools?
  • Can it show time series data as well as custom charts effectively?
  • Can it be easily shared? This is important because you may want to share what you found not only with other team members but also with leadership.
  • Can it be managed using code? You don't want to be a full-time monitoring administrator. You need to be able to manage your monitoring system through code.
Metrics

Metrics are a numerical measurement of a property, typically a counter accompanied by attributes. Their key characteristics:

  • Efficient to ingest.
  • Efficient to query.
  • Not efficient for identifying the root cause of a problem: metrics can tell you what's going on in the system, but not why it's happening.
  • Suitable for low-cardinality data, when you do not have millions of unique values in your data.
Logs

Raw logs are arbitrary text, usually filled with debug data. Their key characteristics:

  • Parsing is generally required to get at the data.
  • Generally slower than metrics, both to ingest and to retrieve.
  • Most of the time, raw logs are what you need to determine the root cause of a problem.
  • No strict requirements in terms of the cardinality of the data.

You should use metrics where you can because they can be ingested, indexed, and retrieved much faster than logs, so analysis and alerting based on metrics is fast. Logs, in contrast, are usually what you need for root cause analysis (RCA).
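To see why, compare a metric with the log lines that carry the same information. This is just an illustration in Python, with a made-up log format:

import re
from collections import Counter

# Metric: a counter with low-cardinality labels, aggregated as events happen.
server_errors = Counter()

def record_request(method, status):
    if status >= 500:
        server_errors[(method, status)] += 1   # cheap to update, instant to query

# Log: arbitrary text that must be parsed at read time to answer the same question.
LOG_LINE = re.compile(r"(?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})")

def count_errors_in_logs(lines):
    found = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and int(match.group("status")) >= 500:
            found[(match.group("method"), int(match.group("status")))] += 1
    return found   # requires scanning every line, but keeps full detail for RCA

record_request("GET", 502)
print(server_errors)
print(count_errors_in_logs(["GET /api/items 502", "POST /login 200"]))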

4 signals to monitor

There's a lot you can monitor, and at some point you have to decide what's important. These four signals cover the essentials (a small sketch of computing them follows the list):

  • Latency: What end users experience in terms of responsiveness from your application.
  • Errors: These can be hard errors, such as an HTTP 500 internal server error, or soft errors, such as a functionality error or a slow response time from a particular component within your application.
  • Traffic: Refers to the total number of requests coming in.
  • Saturation: Generally occurs when a component or a resource cannot handle the load anymore.
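Here is a small sketch of how those four signals might be computed from a window of request data (the record format and capacity figure are assumptions for illustration; in practice your monitoring platform derives these from the data it collects):

import statistics

def four_signals(requests, window_s, capacity_qps):
    """Summarize one time window of (latency_ms, status_code) request records."""
    latencies = [ms for ms, _ in requests]
    errors = sum(1 for _, status in requests if status >= 500)
    traffic_qps = len(requests) / window_s
    return {
        "latency_p99_ms": statistics.quantiles(latencies, n=100)[98]
        if len(latencies) > 1 else None,
        "error_rate": errors / len(requests) if requests else 0.0,
        "traffic_qps": traffic_qps,
        "saturation": traffic_qps / capacity_qps,   # above 1.0 means overloaded
    }

print(four_signals([(120, 200), (95, 200), (440, 503)], window_s=60, capacity_qps=5))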
Monitoring resources

Data has to be derived from somewhere. Here are common resources used in building a monitoring system:

  • CPU: In some cases CPU utilization can indicate an underlying problem.
  • Memory: Application and System memory. Application memory could be the Java heap size in a Java application.
  • Disk I/O: Many applications are heavy I/O dependent, so it's important to monitor disk performance.
  • Disk volume: Monitors the sizes of all your file-systems.
  • Network bandwidth: It's critical to monitor the network bandwidth utilized by your application. This can provide insight into eliminating performance bottlenecks.
3 best practices for monitoring for SREs

Above all else, remember the three best practices for an effective monitoring system in your SRE organization:

  1. Configuration as code: Makes it easy to deploy monitoring to new environments.
  2. Unified dashboards: Converge to a unified pattern that enables reuse of the dashboards.
  3. Consistency: Whatever monitoring tool you use, the components that you create within the monitoring tool should follow a consistent naming convention.
Rolling back changes

To minimize user impact when a change does not go as expected, you need to buy time to fix bugs. With fine-grained rollback, you can roll back only the portion of the change that caused the impact, thus minimizing overall user impact.

If things don't go well during your "canary" release, you may want to roll back your changes. When combined with progressive rollouts, it's possible to completely eliminate user impact when you have a solid rollback mechanism in place.

Roll back fast and roll back often. Your rollback process will become bulletproof over time!

Mechanics of rollback

Automation is key. You need to have scripts and processes in place before you attempt a rollback. One way application developers roll back a change is to simply toggle flags as part of the configuration: a new feature in your application can be turned on and off by switching a flag.
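As a sketch of that flag-toggling approach (the flag name, file path, and checkout functions are invented for illustration):

import json

def old_checkout(order_id):
    return f"old checkout flow for {order_id}"   # the known-good code path

def new_checkout(order_id):
    return f"new checkout flow for {order_id}"   # the feature being rolled out

def load_flags(path="flags.json"):
    """Flags live in version-controlled configuration, separate from the binary."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}                                # safe default: every flag off

def handle_order(order_id, flags):
    # Rolling back the feature is a configuration release that flips the flag
    # to false; no binary has to be rebuilt or redeployed.
    if flags.get("new_checkout_flow", False):
        return new_checkout(order_id)
    return old_checkout(order_id)

print(handle_order("order-42", load_flags()))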

The entire rollback could be a configuration file release. In general, a rollback of the entire release is preferred over a partial rollback. Use a package management system with version numbers and labels that are clearly documented.

A rollback is still a change, technically speaking: you have already made a change and you are reverting it. Most rollbacks entail a scenario that was not tested before, so you have to be cautious with them.

Roll forward

With roll forward, instead of rolling back your changes, you release a quick "hot fix," an upgraded version of the software that includes the fix. Rolling forward may not always be possible, and you might have to run the system in a degraded state until an upgrade is available and the roll forward is fully complete. In some cases, rolling forward may be safer than a rollback, especially when the change involves multiple sub-systems.

Change is good

Automation is key. Your builds, tests, and releases should all be automated.

Use "canaries" for catching issues early, but remember that "canaries" are not a replacement for automated testing.

Monitoring should be designed to meet your service level objectives. Choose your monitoring tools carefully. You may have to deploy more than one monitoring system.

Finally, there are three tenets of an effective CM system:

  1. Progressive rollout: Strive to do your changes in a progressive manner.
  2. Monitoring: A foundational capability for your SRE teams.
  3. Safe and fast rollbacks: Have processes and automation in place that increase confidence in your SRE organization.

In the next article, the third part of this series, I will cover some important technical topics related to SRE best practices, including the circuit breaker pattern, self-healing systems, distributed consensus, effective load balancing, autoscaling, and effective health checks.

The three core tenets of effective change management for SREs are progressive rollouts, monitoring, and safe and fast rollbacks.


How SREs can achieve effective incident response

Tue, 06/21/2022 - 15:00
How SREs can achieve effective incident response Robert Kimani Tue, 06/21/2022 - 03:00

Incident response includes monitoring, detecting, and reacting to unplanned events such as security breaches or other service interruptions. The goal is to get back to business, satisfy service level agreements (SLAs), and provide services to employees and customers. Incident response is the planned reaction to a breach or interruption. One goal is to avoid unmanaged incidents.

Establish an on-call system

One way of responding is to establish an on-call system. These are the steps to consider when you're setting up an on-call system:

  1. Design an effective on-call system 
  2. Understand managed vs. unmanaged incidents
  3. Build and implement an effective postmortem process
  4. Learn the tools and templates for postmortems
Understand managed and unmanaged incidents

An unmanaged incident is an issue that an on-call engineer handles, often with whatever team member happens to be available to help. More often than not, unmanaged incidents become serious issues because they are not handled correctly. Issues include:

  • No clear roles.
  • No incident command.
  • Random team members involved ("freelancing"), which is the primary killer of the incident management process.
  • Poor (or lack of) communication.
  • No central body running troubleshooting.

A managed incident is one handled with clearly defined procedures and roles. Even when an incident isn't anticipated, it's still met with a team that's prepared. A managed incident is ideal. It includes:

  • Clearly defined roles.
  • Designated incident command that leads the effort.
  • Only the ops-team defined by the incident command updates systems.
  • A dedicated communications role. Until a communications person is identified, the Incident Command can fill this role.
  • A recognized command post such as a "war room." Some organizations have a defined "war room bridge number" where all the incidents are handled.

Incident management takes place in a war room. The Incident Command is the role that leads the war room. This role is also responsible for organizing people around the operations team, planning, and communication.

The Operations Team is the only team that can touch the production systems. Hint: The next time you join an incident management team, the first question to ask is, "Who is running the Incident Command?"

Deep dive into incident management roles

Incident management roles clearly define who is responsible for what activities. These roles should be established ahead of time and well-understood by all participants.

Incident Command: Runs the war room and assigns responsibilities to others.

Operations Team: Only role allowed to make changes to the production system.

Communication Team: Provides periodic updates to stakeholders such as the business partners or senior executives.

Planning Team: Supports operations by handling long-term items such as providing bug fixes, postmortems, and anything that requires a planning perspective.

As an SRE, you'll probably find yourself in the Operations Team role, but you may also have to fill other roles.

Build and implement an effective postmortem process

Postmortem is a critical part of incident management that occurs once the incident is resolved.

Why postmortem?
  • Fully understand/document the incident using postmortems. You can ask questions such as "What could have been done differently?"
  • Conduct a deep dive "root cause" analysis, producing valuable insights.
  • Learn from the incident. This is the primary benefit of doing postmortems.
  • Identify opportunities for prevention as part of postmortem analysis, e.g., identify a monitoring enhancement to catch an issue sooner in the future.
  • Plan and follow through with assigned activities as part of the postmortem.
Blameless postmortem: a fundamental tenet of SRE

No finger-pointing. People are quite honestly scared about postmortems because one person or team may be held responsible for the outage. Avoid finger-pointing at all costs; instead, focus solely on systems and processes and not on individuals. Isolating individuals/teams can create an unhealthy culture. For instance, the next time someone commits a mistake, they will not come forward and accept it. They may hide the activity due to the fear of being blamed.

Though there is no room for finger-pointing, the postmortem must call out improvement opportunities. This approach helps avoid further similar incidents.

When is a postmortem needed?

Is a postmortem necessary for all incidents or only for certain situations? Here are some suggestions for when a postmortem is useful:

  • End-user experience impact beyond a threshold (SLO). If the SLO in place is impacted due to:
    • Unavailable services
    • Unacceptable performance
    • Erratic functionality
  • Data loss.
  • Organization/group-specific requirements with different policies and protocols to follow.
Six minimum items required in a postmortem

The postmortem should include the following six components:

  1. Summary: Provide a succinct incident summary.
  2. Impact (must include any financial impact): Executives will look for impact and financial information.
  3. Root cause(s): Identify the root cause, if possible.
  4. Resolution: What the team actually did to fix the issue.
  5. Monitoring (issue detection): Specify how the incident was identified. Hopefully, this was a monitoring system rather than an end-user complaint.
  6. Action items with due dates and owners: This is important. Do not simply conduct a postmortem and forget the incident. Establish action items, assign owners, and follow through on these. Some organizations may also include a detailed timeline of occurrences in the postmortem, which can be useful to walk through the sequence of events.

Before the postmortem is published, a supervisor or senior team member(s) must review the document to avoid any errors or misrepresentation of facts.

Find postmortem tools and templates

If you haven't done postmortems before, you may be wondering how to get started. You've learned a lot about postmortems thus far, but how do you actually implement one?

That's where tools and templates come into play. There are many tools available. Consider the following:

  1. Existing ITSM tools in your organization. Popular examples include ServiceNow, Remedy, Atlassian ITSM, etc. Existing tools likely provide postmortem tracking capabilities.
  2. Open source tools are also available, the most popular being Morgue, released by Etsy. Another popular choice is PagerDuty.
  3. Develop your own. Remember, SREs are also software engineers! It doesn't have to be fancy, but it must have an easy-to-use interface and a way to store the data reliably.
  4. Templates: Documents that you can readily use to track your postmortems. Many postmortem templates are freely available online; choose one and adapt it to your organization.
Wrap up

Here are the key points for the above incident response discussion:

  • An effective on-call system is necessary to ensure service availability and health.
  • Balance workload for on-call engineers.
    • Allocate resources.
    • Use multi-region support.
    • Promote a safe and positive environment.
  • Incident management must facilitate a clear separation of duties.
    • Incident command, operations, planning, and communication.
  • Blameless postmortems help prevent repeated incidents.

Incident management is only one side of the equation. For an SRE organization to be effective, it must also have a change management system in place. After all, changes cause many incidents.

The next article looks at ways to apply effective change management.


Get back to business and continue services in a timely manner by implementing a thorough incident response strategy.


7 summer book recommendations from open source enthusiasts

Tue, 06/21/2022 - 15:00
7 summer book recommendations from open source enthusiasts Joshua Allen Holm Tue, 06/21/2022 - 03:00

It is my great pleasure to introduce Opensource.com's 2022 summer reading list. This year's list contains seven wonderful reading recommendations from members of the Opensource.com community. You will find a nice mix of books covering everything from a fun cozy mystery to non-fiction works that explore thought-provoking topics. I hope you find something on this list that interests you.

Enjoy!

Image by:

O'Reilly Press

97 Things Every Java Programmer Should Know: Collective Wisdom from the Experts, edited by Kevlin Henney and Trisha Gee

Recommendation written by Seth Kenlon

Written by 73 different authors working in all aspects of the software industry, the secret to this book's greatness is that it actually applies to much more than just Java programming. Of course, some chapters lean into Java, but there are topics like Be aware of your container surroundings, Deliver better software, faster, and Don't hIDE your tools that apply to development regardless of language.

Better still, some chapters apply to life in general. Break problems and tasks into small chunks is good advice on how to tackle any problem, Build diverse teams is important for every group of collaborators, and From puzzles to products is a fascinating look at how the mind of a puzzle-solver can apply to many different job roles.

Each chapter is just a few pages, and with 97 to choose from, it's easy to skip over the ones that don't apply to you. Whether you write Java code all day, just dabble, or if you haven't yet started, this is a great book for geeks interested in code and the process of software development.

Image by:

Princeton University Press

A City is Not a Computer: Other Urban Intelligences, by Shannon Mattern

Recommendation written by Scott Nesbitt

These days, it's become fashionable (if not inevitable) to make everything smart: Our phones, our household appliances, our watches, our cars, and, especially, our cities.

With the latter, that means putting sensors everywhere, collecting data as we go about our business, and pushing information (whether useful or not) to us based on that data.

This raises the question: does embedding all that technology in a city make it smart? In A City Is Not a Computer, Shannon Mattern argues that it doesn't.

A goal of making cities smart is to provide better engagement with and services to citizens. Mattern points out that smart cities often "aim to merge the ideologies of technocratic managerialism and public service, to reprogram citizens as 'consumers' and 'users'." That, instead of encouraging citizens to be active participants in their cities' wider life and governance.

Then there's the data that smart systems collect. We don't know what and how much is being gathered. We don't know how it's being used and by whom. There's so much data being collected that it overwhelms the municipal workers who deal with it. They can't process it all, so they focus on low-hanging fruit while ignoring deeper and more pressing problems. That definitely wasn't what cities were promised when they were sold smart systems as a balm for their urban woes.

A City Is Not a Computer is a short, dense, well-researched polemic against embracing smart cities because technologists believe we should. The book makes us think about the purpose of a smart city, who really benefits from making a city smart, and makes us question whether we need to or even should do that.

Image by:

Tilted Windmill Press

git sync murder, by Michael Warren Lucas

Recommendation written by Joshua Allen Holm

Dale Whitehead would rather stay at home and connect to the world through his computer's terminal, especially after what happened at the last conference he attended. During that conference, Dale found himself in the role of an amateur detective solving a murder. You can read about that case in the first book in this series, git commit murder.

Now, back home and attending another conference, Dale again finds himself in the role of detective. git sync murder finds Dale attending a local tech conference/sci-fi convention where a dead body is found. Was it murder or just an accident? Dale, now the "expert" on these matters, finds himself dragged into the situation and takes it upon himself to figure out what happened. To say much more than that would spoil things, so I will just say git sync murder is engaging and enjoyable to read. Reading git commit murder first is not necessary to enjoy git sync murder, but I highly recommend both books in the series.

Michael Warren Lucas's git murder series is perfect for techies who also love cozy mysteries. Lucas has literally written the book on many complex technical topics, and it carries over to his fiction writing. The characters in git sync murder talk tech at conference booths and conference social events. If you have not been to a conference recently because of COVID and miss the experience, Lucas will transport you to a tech conference with the added twist of a murder mystery to solve. Dale Whitehead is an interesting, if somewhat unorthodox, cozy mystery protagonist, and I think most Opensource.com readers would enjoy attending a tech conference with him as he finds himself thrust into the role of amateur sleuth.

Image by:

Inner Wings Foundation

Kick Like a Girl, by Melissa Di Donato Roos

Recommendation written by Joshua Allen Holm

Nobody likes to be excluded, but that is what happens to Francesca when she wants to play football at the local park. The boys won't play with her because she's a girl, so she goes home upset. Her mother consoles her by relating stories about various famous women who have made an impact in some significant way. The historical figures detailed in Kick Like a Girl include women from throughout history and from many different fields. Readers will learn about Frida Kahlo, Madeleine Albright, Ada Lovelace, Rosa Parks, Amelia Earhart, Marie Curie, Valentina Tereshkova, Florence Nightingale, and Malala Yousafzai. After hearing the stories of these inspiring figures, Francesca goes back to the park and challenges the boys to a football match.

Kick Like a Girl features engaging writing by Melissa Di Donato Roos (SUSE's CEO) and excellent illustrations by Ange Allen. This book is perfect for young readers, who will enjoy the rhyming text and colorful illustrations. Di Donato Roos has also written two other books for children, How Do Mermaids Poo? and The Magic Box, both of which are also worth checking out.

Image by:

Doubleday

Mine!: How the Hidden Rules of Ownership Control Our Lives, by Michael Heller and James Salzman

Recommendation written by Bryan Behrenshausen


"A lot of what you know about ownership is wrong," authors Michael Heller and James Salzman write in Mine! It's the kind of confrontational invitation people drawn to open source can't help but accept. And this book is certainly one for open source aficionados, whose views on ownership—of code, of ideas, of intellectual property of all kinds—tend to differ from mainstream opinions and received wisdom. In this book, Heller and Salzman lay out the "hidden rules of ownership" that govern who controls access to what. These rules are subtle, powerful, deeply historical conventions that have become so commonplace they just seem incontrovertible. We know this because they've become platitudes: "First come, first served" or "You reap what you sow." Yet we see them play out everywhere: On airplanes in fights over precious legroom, in the streets as neighbors scuffle over freshly shoveled parking spaces, and in courts as juries decide who controls your inheritance and your DNA. Could alternate theories of ownership create space for rethinking some essential rights in the digital age? The authors certainly think so. And if they're correct, we might respond: Can open source software serve as a model for how ownership works—or doesn't—in the future?

Image by:

Lulu.com

Not All Fairy Tales Have Happy Endings: The Rise and Fall of Sierra On-Line, by Ken Williams

Recommendation written by Joshua Allen Holm

During the 1980s and 1990s, Sierra On-Line was a juggernaut in the computer software industry. From humble beginnings, this company, founded by Ken and Roberta Williams, published many iconic computer games. King's Quest, Space Quest, Quest for Glory, Leisure Suit Larry, and Gabriel Knight are just a few of the company's biggest franchises.

Not All Fairy Tales Have Happy Endings covers everything from the creation of Sierra's first game, Mystery House, to the company's unfortunate and disastrous acquisition by CUC International and the aftermath. The Sierra brand would live on for a while after the acquisition, but the Sierra founded by the Williamses was no more. Ken Williams recounts the entire history of Sierra in a way that only he could. His chronological narrative is interspersed with chapters providing advice about management and computer programming. Ken Williams had been out of the industry for many years by the time he wrote this book, but his advice is still extremely relevant.

Sierra On-Line is no more, but the company made a lasting impact on the computer gaming industry. Not All Fairy Tales Have Happy Endings is a worthwhile read for anyone interested in the history of computer software. Sierra On-Line was at the forefront of game development during its heyday, and there are many valuable lessons to learn from the man who led the company during those exciting times.

Image by:

Back Bay Books

The Soul of a New Machine, by Tracy Kidder

Recommendation written by Gaurav Kamathe

I am an avid reader of the history of computing. It's fascinating to know how these intelligent machines that we have become so dependent on (and often take for granted) came into being. I first heard of The Soul of a New Machine via Bryan Cantrill's blog post. This non-fiction book by Tracy Kidder was published in 1981 and won him a Pulitzer Prize. Imagine it's the 1970s, and you are part of the engineering team tasked with designing the next-generation computer. The story begins at Data General Corporation, a minicomputer vendor of the time racing to compete with the 32-bit VAX computers from Digital Equipment Corporation (DEC). The book outlines how two competing teams within Data General, both wanting to take a shot at designing the new machine, end up in a feud. What follows is a fascinating look at the events that unfold. The book provides insights into the minds of the engineers involved, the management, their work environment, the technical challenges they faced along the way and how they overcame them, how stress affected their personal lives, and much more. Anybody who wants to know what goes into making a computer should read this book.

That's the 2022 suggested reading list. It provides a variety of great options that I believe will give Opensource.com readers many hours of thought-provoking entertainment. Be sure to check out our previous reading lists for even more book recommendations.

Members of the Opensource.com community recommend this mix of books covering everything from a fun cozy mystery to non-fiction works that explore thought-provoking topics.

Image by:

Photo by Carolyn V on Unsplash


How I use the attr command with my Linux filesystem

Mon, 06/20/2022 - 15:00
How I use the attr command with my Linux filesystem Seth Kenlon Mon, 06/20/2022 - 03:00

The term filesystem is a fancy word to describe how your computer keeps track of all the files you create. Whether it's an office document, a configuration file, or thousands of digital photos, your computer has to store a lot of data in a way that's useful for both you and it. Filesystems like Ext4, XFS, JFS, BtrFS, and so on are the "languages" your computer uses to keep track of data.

Your desktop or terminal can do a lot to help you find your data quickly. Your file manager might have, for instance, a filter function so you can quickly see just the image files in your home directory, or it might have a search function that can locate a file by its filename, and so on. These qualities are known as file attributes because they are exactly that: Attributes of the data object, defined by code in file headers and within the filesystem itself. Most filesystems record standard file attributes such as filename, file size, file type, time stamps for when it was created, and time stamps for when it was last visited.

I use the open source XFS filesystem on my computers not only for its reliability and high performance but also for the subtle convenience of extended attributes.

Common file attributes

When you save a file, data about it are saved along with it. Common attributes tell your operating system whether to update the access time, when to synchronize the data in the file back to disk, and other logistical details. Which attributes get saved depends on the capabilities and features of the underlying filesystem.

In addition to standard file attributes (insofar as there are standard attributes), the XFS, Ext4, and BtrFS filesystems can all use extended attributes.

Extended attributes

XFS, Ext4, and BtrFS allow you to create your own arbitrary file attributes. Because you're making up attributes, there's nothing built into your operating system to utilize them, but I use them as "tags" for files in much the same way I use EXIF data on photos. Developers might choose to use extended attributes to develop custom capabilities in applications.

There are two "namespaces" for attributes in XFS: user and root. When creating an attribute, you must add your attribute to one of these namespaces. To add an attribute to the root namespace, you must use the sudo command or be logged in as root.

Add an attribute

You can add an attribute to a file on an XFS filesystem with the attr or setfattr commands.

The attr command assumes the user namespace, so you only have to set (-s) a name for your attribute followed by a value (-V):

$ attr -s flavor -V vanilla example.txt
Attribute "flavor" set to a 7 byte value for example.txt:
vanilla

The setfattr command requires that you specify the target namespace:

$ setfattr --name user.flavor --value chocolate example.txt
List extended file attributes

Use the attr or getfattr commands to see extended attributes you've added to a file. The attr command defaults to the user namespace and uses the -g option to get extended attributes:

$ attr -g flavor example.txt
Attribute "flavor" had a 9 byte value for example.txt:
chocolate

The getfattr command requires the namespace and name of the attribute:

$ getfattr --name user.flavor example.txt
# file: example.txt
user.flavor="chocolate"List all extended attributes

To see all extended attributes on a file, you can use attr -l:

$ attr -l example.txt
Attribute "md5sum" has a 32 byte value for example.txt
Attribute "flavor" has a 9 byte value for example.txt

Alternately, you can use getfattr -d:

$ getfattr -d example.txt
# file: example.txt
user.flavor="chocolate"
user.md5sum="969181e76237567018e14fe1448dfd11"

Any extended file attribute can be updated with attr or setfattr, just as if you were creating the attribute:

$ setfattr --name user.flavor --value strawberry example.txt

$ getfattr -d example.txt
# file: example.txt
user.flavor="strawberry"
user.md5sum="969181e76237567018e14fe1448dfd11"Attributes on other filesystems

The greatest risk when using extended attributes is forgetting that these attributes are specific to the filesystem they're on. That means when you copy a file from one drive or partition to another, the attributes are lost even if the target filesystem supports extended attributes.

To avoid losing extended attributes, you must use a tool that supports retaining them, such as the rsync command.

$ rsync --archive --xattrs ~/example.txt /tmp/

No matter what tool you use, if you transfer a file to a filesystem that doesn't know what to do with extended attributes, those attributes are dropped.
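If you script in something other than shell, the same extended attributes are reachable from higher-level languages too. On Linux, for example, Python's os module exposes setxattr, getxattr, and listxattr; here's a small sketch mirroring the examples above:

import os

path = "example.txt"
open(path, "a").close()                         # make sure the file exists

# Same user-namespace attribute as the attr and setfattr examples.
os.setxattr(path, "user.flavor", b"strawberry")

print(os.getxattr(path, "user.flavor").decode())    # strawberry
for name in os.listxattr(path):                     # list every extended attribute
    print(name, os.getxattr(path, name).decode())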

Search for attributes

There aren't many mechanisms to interact with extended attributes, so the options for using the file attributes you've added are limited. I use extended attributes as a tagging mechanism, which allows me to associate files that have no obvious relation to one another. For instance, suppose I need a Creative Commons graphic for a project I'm working on. Assume I've had the foresight to add the extended attribute license to my collection of graphics. I could search my graphic folder with find and getfattr together:

find ~/Graphics/ -type f \
-exec getfattr \
--name user.license \
-m cc-by-sa {} \; 2>/dev/null

# file: /home/tux/Graphics/Linux/kde-eco-award.png
user.license="cc-by-sa"
user.md5sum="969181e76237567018e14fe1448dfd11"Secrets of your filesystem

Filesystems aren't generally something you're meant to notice. They're literally systems for defining a file. It's not the most exciting task a computer performs, and it's not something users are supposed to be concerned with. But some filesystems give you fun, safe, special abilities, and extended file attributes are a good example. Their use may be limited, but they are a unique way to add context to your data.

I use the open source XFS filesystem because of the subtle convenience of extended attributes. Extended attributes are a unique way to add context to my data.

Image by:

Internet Archive Book Images. Modified by Opensource.com. CC BY-SA 4.0


What you need to know about site reliability engineering

Mon, 06/20/2022 - 15:00
What you need to know about site reliability engineering Robert Kimani Mon, 06/20/2022 - 03:00

What is site reliability engineering? The creator of the first site reliability engineering (SRE) program, Benjamin Treynor Sloss at Google, described it this way:

Site reliability engineering is what happens when you ask a software engineer to design an operations team.

What does that mean? Unlike traditional system administrators, site reliability engineers (SREs) apply solid software engineering principles to their day-to-day work. For laypeople, a clearer definition might be:

Site reliability engineering is the discipline of building and supporting modern production systems at scale.

SREs are responsible for reliability, availability, performance, latency, efficiency, monitoring, emergency response, change management, release planning, and capacity planning for both infrastructure and software. As applications and infrastructure grow more complex, SRE teams help ensure that these systems can evolve.

[ Read next: 8 reasons site reliability engineer is one of the most in-demand jobs in 2022 ]

What does an SRE organization do?

There are four primary responsibilities of an SRE organization:

  • Availability: SREs are responsible for the availability of the services they support. After all, if services are not available, end users' work is disrupted, which can cause serious damage to your organization's credibility.
     
  • Performance: A service needs to be not only available but also highly performant. For example, how useful is a website that takes 20 seconds to move from one page to another?
     
  • Incident management: SREs manage the response to unplanned disruptions that impact customers, such as outages, service degradation, or interruptions to business operations.
     
  • Monitoring: A foundational requirement for every SRE, monitoring involves collecting, processing, aggregating, and displaying real-time quantitative data about a system. This could include query counts and types, error counts and types, processing times, and server lifetimes.

Occasionally, release and capacity planning are also the responsibility of the SRE organization.

How do SREs maintain site reliability?

The SRE role is a diverse one, with many responsibilities. An SRE must be able to identify an issue quickly, troubleshoot, and mitigate it with minimal disruption to operations.

Here's a partial list of the tasks a typical SRE undertakes:

  • Writing code: An SRE is required to solve problems using software, whether they are a software engineer with an operations background or a system engineer with a development background.
     
  • Being on call: This is not the most attractive part of being an SRE, but it is essential.
     
  • Leading a war room: SREs facilitate discussions of strategy and execution during incident management.
     
  • Performing postmortems: This is an excellent tool to learn from an incident and identify processes that can be put in place to avoid future incidents.
     
  • Automating: SREs tend to get bored with manual steps. Automation not only saves time but reduces failures due to human errors. Spending some time on engineering by automating tasks can have a strong return on investment.
     
  • Implementing best practices: SREs are well versed in distributed systems and web-scale architectures. They apply best practices in several areas of service management.
Designing an effective on-call system

An on-call management system streamlines the process of adding members of the SRE team into after-hours or weekend call schedules, assigning them equitable responsibility for managing alerts outside of traditional work hours or on holidays. In some cases, an organization might designate on-call SREs around the clock.

In the medical profession, on-call doctors don't have to be on site, but they do have to be prepared to show up and deal with emergencies anytime during their on-call shift. SRE professionals likewise use on-call schedules to make sure that someone's always there to respond to major bugs, capacity issues, or product downtime. If they can't fix the problem on their own, they're also responsible for escalating the issue. For SRE teams who run services for which customers expect 24/7/365, 99.999% uptime and availability, on-call staffing is especially critical.

There are two main kinds of on-call design structures that can be used when designing an on-call system, and they focus on domain expertise and ownership of a given service:

  • Single-team ownership model
  • Shared ownership model

In most cases, single-team ownership will be the better model.

The on-call SRE has multiple duties:

  • Protecting production systems: The SRE on call serves as a guardian to all production services they are required to support.
     
  • Responding to emergencies within an acceptable time: Your organization may choose to have a service-level objective (SLO) for SRE response time. In most cases, anywhere from 5 to 15 minutes is an acceptable response time. Automated monitoring and alerting solutions also empower SREs to respond immediately to any interruptions to service availability. (A minimal sketch of such an SLO check appears after this list.)
     
  • Involving team members and escalating issues: The on-call SRE is responsible for identifying and calling in the right team members to address specific problems.
     
  • Tackling non-emergent issues: In some organizations, a secondary on-call engineer is scheduled to handle non-emergencies, like email alerts.
     
  • Writing postmortems: As noted above, a good postmortem is a valuable tool for documenting and learning from significant incidents.
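
To illustrate the response-time SLO mentioned in the list above, here is a minimal Python sketch, not part of the article, that compares acknowledgement times against a 15-minute target; the incident timestamps and the target itself are illustrative assumptions.

# A minimal sketch: compare acknowledgement times against a 15-minute response SLO.
# The incident data and the SLO target are illustrative assumptions.
from datetime import datetime, timedelta

SLO = timedelta(minutes=15)

# (paged_at, acknowledged_at) pairs, as they might be exported from an alerting tool
incidents = [
    (datetime(2022, 6, 1, 2, 10), datetime(2022, 6, 1, 2, 18)),
    (datetime(2022, 6, 3, 14, 5), datetime(2022, 6, 3, 14, 31)),
]

for paged_at, acked_at in incidents:
    response = acked_at - paged_at
    status = "within SLO" if response <= SLO else "SLO missed"
    print(f"Paged {paged_at:%Y-%m-%d %H:%M}: responded in {response} ({status})")
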
3 key tenets of an effective on-call management system

A focus on engineering

SREs should spend more time designing solutions than applying band-aids. A general guideline is for SREs to spend 50% of their time on engineering work, such as writing code and automating tasks. When an SRE is on call, the remaining time should be split roughly evenly: about 25% managing incidents and 25% on operations duty.

Balanced workload

Being on call can quickly burn out an engineer if there are too many tickets to handle. If well-coordinated multi-region support is possible, such as a US-based team and an Asia-Pacific team, that arrangement can help limit the detrimental health effects of repeated night shifts. Otherwise, having six to eight SREs per site will help avoid exhaustion. At the same time, make sure all SREs are getting a turn being on call at least once or twice a quarter to avoid getting out of touch with production systems. Fair compensation for on-call work during overnights or holidays, such as additional hours off or cash awards, will also help SREs feel that their extra effort is appreciated.

Positive and safe environment

Clearly defined escalation and blameless postmortem procedures are absolutely necessary for SREs to be effective and productive. Established protocols are central to a robust incident management system. Postmortems must focus on root causes and prevention rather than individual and team actions. If you don't have a clear postmortem procedure in your organization, it is wise to start one immediately.

SRE best practices

This article covered some SRE basics and best practices for establishing and running an SRE on-call management system.

In future articles, I will look at other categories of best practices for SRE, the technologies involved, and the processes to support those technologies. By the end of this series, you'll know how to implement SRE best practices for designing, implementing, and supporting production systems.


Understand the basics and best practices for establishing and maintaining an SRE program in your organization.


Careers DevOps

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

How Wordle inspired me to create a 3D printing wiki the open source way

Sat, 06/18/2022 - 15:00
How Wordle inspired me to create a 3D printing wiki the open source way Adam Bute Sat, 06/18/2022 - 03:00

You decide to buy a 3D printer. You do your research, and you settle on an open system that uses resin as its material. You spend a nice chunk of money, and after a few weeks of waiting, it finally arrives.

You unbox it. It's gorgeous. You do some small assembly, pour in the liquid resin, and you're ready to go. You fire up the software. It asks you to type in the correct parameters for the material. You check the bottle but you can't see any parameters. You check online, but still can't find anything.

A bit confused, you write an email to the manufacturer, asking if they could point you in the right direction. The manufacturer tells you they also don't know the parameters, but they are pretty sure they exist, and you should try to guess them yourself. Baffled, you start wondering whether this is really what resin printing is like, or if you've been duped by this company.

A bad game of Wordle

Unfortunately, this is really what resin printing is like. When you buy a new material, you have to do what's called resin validation. It's essentially making a guess for the parameters and adjusting the numbers based on your results. That's if your guess was decent and anything even comes out of the printer.

It's a lot like playing a game of Wordle, except none of the blocks ever turn green. All you can do is eyeball whether the print looks slightly better or worse with each iteration, and then try again. Finally, at one point you say “looks good to me,” and that's that.

Image by: (Adam Bute, CC BY-SA 4.0)

If that sounds like a deeply unsatisfying game to you, you'd be correct. It's also way too long, often taking days, and wasting a good amount of resin in the process. Resin is expensive. At least Wordle is free.

A little help from above?

So why don't these manufacturers, who know their material best, share the print parameters? Well, their arguments are somewhat reasonable. There are millions of possible combinations of printers and resins, and they can't possibly cover all of them. And even between two printers which are the same model, there can be tiny variations, which affect the numbers slightly.

But 'slightly' is an important word. If users got a good baseline, they could easily adjust the settings if they needed to. At least it would be much quicker than starting from scratch. To be fair, some companies do give recommended settings, but it's hard to trust even these numbers. There are many crafty manufacturers who publish fake, untested settings, just to lure customers in.

The truth is that resin validation is expensive. Resin companies are almost invariably small businesses, strapped for cash, who just can't afford to spend on resin validation. So, they outsource this work to the end user. But this creates a kind of absurd situation in the resin printing world. Instead of one man at one company doing the validation work once, hundreds of people do the same work over and over again just to come to the same conclusion.

Makers to the rescue

So what does the maker community do? They try to fix the problem themselves. Reddit and Facebook groups are full of people happily sharing screenshots of a new setting they figured out. Nice gesture, but not very useful. Some ingenious community members created live resin setting spreadsheets that are updated based on user submissions. These are fantastic resources, and universally loved, but they too have their limitations.

They are messy, rarely updated, and there's no way to tell whether the settings actually work. And because anonymous users host them, they are sometimes turned into spam, or randomly deleted. Most recently the biggest community spreadsheet was unexpectedly deleted, erasing years' worth of crowdsourced data with it.

Image by: (Adam Bute, CC BY-SA 4.0)

Making a resin setting database

I'm a bit of an egghead myself, and I really enjoy organizing things. I had the idea to collect all these settings from manufacturers and communities and put them on a nice website. I registered the domain makertrainer.com and used two awesome open source tools to build the site: MediaWiki and DataTables.

Image by: (Adam Bute, CC BY-SA 4.0)

I made it a wiki so that anyone could contribute, but made certain spam or vandalism could be easily undone. I also added some buttons to allow users to vote on whether a setting worked or not. I posted it in some communities to see if people would find it useful, and the response has been overwhelming. I didn't realize this at the time, but I accidentally created the biggest resin setting database on the internet. Users kept spreading it on social media, blog posts, YouTube and so on. With so many people submitting new settings, the database grows larger daily.

Image by: (Adam Bute, CC BY-SA 4.0)

The fact that people are so enthusiastic shows just how much of a void there is for documentation in 3D printing. There's a lot of practical experience scattered amongst makers, most of which is never written down or shared publicly. Everyone needs to do a better job recording practical knowledge in addition to theory. Otherwise, the next generation of makers will commit the same mistakes again. The database is a small contribution in this regard, but I hope it can continue to grow, and make 3D printing just a little easier for everyone.

3D printing is a lot like a game of Wordle. I aim to solve the puzzle by creating crowdsourced documentation and databases.


3D printing

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

What scrum masters can learn from dancing

Fri, 06/17/2022 - 15:00
What scrum masters can learn from dancing Kelsea Zhang Fri, 06/17/2022 - 03:00

Many scrum masters have an obsession with quickly turning their teams into what they want them to be. Once their expectations are not met within some arbitrary time limit, or someone resists their ideas, then the fight begins.

But the fight takes energy, time, and resources, and most of the time it doesn't solve the problem. In fact, most of the time it worsens the problem. When this happens, it's time for some reflection about the role of the scrum master.

Priorities of the scrum master and the team

Even when a team agrees that a problem is indeed a problem, the team may not agree on what's most urgent. As the scrum master, if you ask the team to put effort into what it perceives as a secondary concern, the team will likely resist. They may cooperate nominally, but even in the best case you're only getting a fraction of their potential.

Don't be stubborn

Some scrum masters want to make their company look a very specific way, usually in accordance with common models or methods in the industry. Sometimes they openly confront managers who don't align with their vision of how things "should" be.

It might feel like you're "fighting the good fight" when you attempt to do this, but that doesn't mean you're making progress. You have important work to do, and while your entire company might eventually be transformed, you have to start somewhere. The more support you gather from your team, the better chance you have of spreading your agile principles throughout the organization.

Dancing together

A wise scrum master takes harmonious steps with their team, without stepping on each other's feet or tripping anyone up. No one would mistake them for fighting each other. I call this "dancing together" because, like dancing, it takes coordination, verbal and nonverbal communication, and cooperation. And when done even moderately well, it renders something elegant and enjoyable.

Awareness

As a scrum master, you need to reflect on yourself all the time. How are your ideas and actions being received? If there's a sense of resistance or competition, then something is probably wrong, either in how you've been communicating, or how you've been seeking feedback and participation.

As a scrum master, you need to be aware of your environment. There are ebbs and flows within an organization, times when great change is appropriate and times when fundamental groundwork must be laid. Look for that in your organization, starting with your team. Wait for the right person to be in the right place at the right time, wait for an opportunity, wait for a certain policy, and then integrate agile methodology to drive changes.

Get buy-in from the top

In most organizations, the scrum master ultimately serves the business goals of upper management. Transparency and communication are important. It's your job to understand your organization's objectives. Ask for advice from your managers, and get a clear picture of the intentions and expectations of the rest of the company. A scrum master can help management achieve their goals with expertise and efficiency, but only if you understand the objective.

Tango over Foxtrot

Most of the actual agile transformation in real enterprises is very slow. It is not a fast and furious battle. The scrum master has to step in rhythm with the pace of the enterprise. You must strike a balance between who leads and who flourishes. You don't want to move too slowly or too quickly. You must not pursue perfection in everything. Allow for mistakes and misunderstandings. Don't blame each other, but stay focused on creating something vibrant together.

Like dancing, being a scrum master takes coordination, verbal and nonverbal communication, and cooperation.


DevOps Agile

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Analyze web pages with Python requests and Beautiful Soup

Thu, 06/16/2022 - 15:00
Analyze web pages with Python requests and Beautiful Soup Seth Kenlon Thu, 06/16/2022 - 03:00

Browsing the web probably accounts for much of your day. But it's an awfully manual process, isn't it? You have to open a browser. Go to a website. Click buttons, move a mouse. It's a lot of work. Wouldn't it be nicer to interact with the Internet through code?

You can get data from the Internet using Python with the help of the Python module requests:

import requests

DATA = "https://opensource.com/article/22/5/document-source-code-doxygen-linux"
PAGE = requests.get(DATA)

print(PAGE.text)

In this code sample, you first import the module requests. Then you create two variables: one called DATA to hold the URL you want to download. In later versions of this code, you'll be able to provide a different URL each time you run your application. For now, though, it's easiest to just "hard code" a test URL for demonstration purposes.

The other variable is PAGE, which you set to the response of the requests.get function when it reads the URL stored in DATA. The requests module and its .get function are pre-programmed to "read" an Internet address (a URL), access the Internet, and download whatever is located at that address.

That's a lot of steps you don't have to figure out on your own, and that's exactly why Python modules exist. Finally, you tell Python to print everything that requests.get has stored in the .text field of the PAGE variable.

Beautiful Soup

If you run the sample code above, you get the contents of the example URL dumped indiscriminately into your terminal. It does that because the only thing your code does with the data that requests has gathered is print it. It's more interesting to parse the text.

Python can "read” text with its most basic functions, but parsing text allows you to search for patterns, specific words, HTML tags, and so on. You could parse the text returned by requests yourself, but using a specialized module is much easier. For HTML and XML, there's the Beautiful Soup library.

This code accomplishes the same thing, but it uses Beautiful Soup to parse the downloaded text. Because Beautiful Soup recognizes HTML entities, you can use some of its built-in features to make the output a little easier for the human eye to parse.

For instance, instead of printing raw text at the end of your program, you can run the text through the .prettify function of Beautiful Soup:

from bs4 import BeautifulSoup
import requests

PAGE = requests.get("https://opensource.com/article/22/5/document-source-code-doxygen-linux")
SOUP = BeautifulSoup(PAGE.text, 'html.parser')

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    # do a thing here
    print(SOUP.prettify())

The output of this version of your program ensures that every opening HTML tag starts on its own line, with indentation to help demonstrate which tag is a parent of another tag. Beautiful Soup is aware of HTML tags in more ways than just how it prints them out.

Instead of printing the whole page, you can single out a specific kind of tag. For instance, try changing the print selector from print(SOUP.prettify()) to this:

  print(SOUP.p)

This prints just a <p> tag. Specifically, it prints just the first <p> tag encountered. To print all <p> tags, you need a loop.

Looping

Create a for loop to cycle over the entire webpage contained in the SOUP variable, using the find_all function of Beautiful Soup. It's not unreasonable to want to use your loop for other tags besides just the <p> tag, so build it as a custom function, designated by the def keyword (for "define") in Python.

def loopit():
    for TAG in SOUP.find_all('p'):
        print(TAG)

The temporary variable TAG is arbitrary. You can use any term, such as ITEM or i or whatever you want. Each time the loop runs, TAG contains the search results of the find_all function. In this code, the <p> tag is being searched for.

A function doesn't run unless it's explicitly called. You can call your function at the end of your code:

# Press the green button in the gutter to run the script.
if __name__ == '__main__':
    # do a thing here
    loopit()

Run your code to see all <p> tags and each one's contents.

Getting just the content

You can exclude tags from being printed by specifying that you want just the "string" (programming lingo for "words").

def loopit():
    for TAG in SOUP.find_all('p'):
        print(TAG.string)

Of course, once you have the text of a webpage, you can parse it further with the standard Python string libraries. For instance, you can get a word count using len and split:

def loopit():
    for TAG in SOUP.find_all('p'):
        if TAG.string is not None:
            print(len(TAG.string.split()))

This prints the number of words within each paragraph element, omitting those paragraphs that don't contain a string. To get a grand total, use a variable and some basic math:

def loopit():
    NUM = 0
    for TAG in SOUP.find_all('p'):
        if TAG.string is not None:
            NUM = NUM + len(TAG.string.split())
    print("Grand total is ", NUM)Python homework

There's a lot more information you can extract with Beautiful Soup and Python. Here are some ideas on how to improve your application:

  • Accept input so you can specify what URL to download and analyze when you launch your application.
  • Count the number of images (<img> tags) on a page. (A minimal sketch of these first two ideas follows this list.)
  • Count the number of images (<img> tags) within another tag (for instance, only images that appear inside a particular <div>, or only images that follow a given tag).
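
If you want a starting point for the first two ideas, here is a minimal sketch of one possible approach; it is my addition rather than part of the tutorial, it assumes the requests and beautifulsoup4 packages are installed, and it reuses the find_all technique shown above.

# A minimal sketch: accept a URL as a command-line argument and count its <img> tags.
# Assumes the requests and beautifulsoup4 packages are installed.
import sys

import requests
from bs4 import BeautifulSoup

def count_images(url):
    """Download the page at url and return the number of <img> tags it contains."""
    page = requests.get(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    return len(soup.find_all('img'))

if __name__ == '__main__':
    if len(sys.argv) != 2:
        sys.exit("usage: python count_images.py URL")
    print("Image count:", count_images(sys.argv[1]))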

Follow this Python tutorial to easily extract information about web pages.


Python

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Using habits to practice open organization principles

Thu, 06/16/2022 - 15:00
Using habits to practice open organization principles Ron McFarland Thu, 06/16/2022 - 03:00

Habits are a long-term interest of mine. Several years ago, I gave a presentation on habits, both good and bad, and how to expand on good habits and change bad ones. Just recently, I read the habits-focused book Smart Thinking by Art Markman. You might ask what this has to do with open organization principles. There is a connection, and I'll explain it in this two-part article on managing habits.

In this first article, I talk about habits, how they work, and—most important—how you can start to change them. In the second article, I review Markman's thoughts as presented in his book.

The intersection of principles and habits

Suppose you learned about open organization principles and although you found them interesting and valuable, you just weren't in the habit of using them. Here's how that might look in practice.

Community: If you're faced with a significant challenge but think you can't address it alone, you're likely in the habit of just giving up. Wouldn't it be better to have the habit of building a community of like-minded people that collectively can solve the problem?

Collaboration: Suppose you don't think you're a good collaborator. You like to do things alone. You know that there are cases when collaboration is required, but you don't have a habit of engaging in it. To counteract that, you must build a habit of collaborating more.

Transparency: Say you like to keep most of what you do and know a secret. However, you know that if you don't share information, you're not likely to get good information from others. Therefore, you must create the habit of being more transparent.

Inclusivity: Imagine you are uncomfortable working with people you don't know and who are different from you, whether in personality, culture, or language. You know that if you want to be successful, you must work with a wide variety of people. How do you create a habit of being more inclusive?

Adaptability: Suppose you tend to resist change long after what you're doing is no longer achieving what you had hoped it would. You know you must adapt and redirect your efforts, but how can you create a habit of being adaptive?

What is a habit?

Before I give examples regarding the above principles, I'll explain some of the relevant characteristics of a habit.

  • A habit is a behavior performed repeatedly—so much so that it's now performed without thinking.
  • A habit is automatic and feels right at the time. The person is so used to it that it feels good when doing it, and to do something else would require effort and make them feel uncomfortable. They might have second thoughts afterward, though.
  • Some habits are good and extremely helpful by saving you a lot of energy. The brain is 2% of the body's weight but consumes 20% of your daily energy. Because thinking and concentration require a lot of energy, your mind is built to save it through developing unconscious habits.
  • Some habits are bad for you, so you desire to change them.
  • All habits offer some reward, even if it is only temporary.
  • Habits are formed around what you are familiar with and what you know, even habits you don’t necessarily like.
The three steps of a habit
  1. Cue (trigger): First, a cue or trigger tells the brain to go into automatic mode, using previously learned habitual behavior. Cues can be things like seeing a candy bar or a television commercial, being in a certain place at a certain time of day, or just seeing a particular person. Time pressure can trigger a routine. An overwhelming atmosphere can trigger a routine. Simply put, something reminds you to behave a certain way.
  2. Routine: The routine follows the trigger. A routine is a set of physical, mental, and/or emotional behaviors that can be incredibly complex or extremely simple. Some habits, such as those related to emotions, are measured in milliseconds.
  3. Reward: The final step is the reward, which helps your brain figure out whether a particular activity is worth remembering for the future. Rewards can range from food or drugs that cause physical sensations to joy, pride, praise, or personal self-esteem.
Bad habits in a business environment

Habits aren't just for individuals. All organizations have good and bad institutional habits. However, some organizations deliberately design their habits, while others just let them evolve without forethought, possibly through rivalries or fear. These are some organizational habit examples:

  • Always being late with reports
  • Working alone or working in groups when the opposite is appropriate
  • Being triggered by excess pressure from the boss
  • Not caring about declining sales
  • Not cooperating among a sales team because of excess competition
  • Allowing one talkative person to dominate a meeting

A step-by-step plan to change a habit

Habits don't have to last forever. You can change your own behavior. First, remember that many habits can not be changed concurrently. Instead, find a keystone habit and work on it first. This produces small, quick rewards. Remember that one keystone habit can create a chain reaction.

Here is a four-step framework you can apply to changing any habit, including habits related to open organization principles.

Step one: identify the routine

Identify the habit loop and the routine in it (for example, when an important challenge comes up that you can't address alone). The routine (the behaviors you do) is the easiest to identify, so start there. For example: "In my organization, no one discusses problems with anyone. They just give up before starting." Determine the routine that you want to modify, change, or just study. For example: "Every time an important challenge comes up, I should discuss it with people and try to develop a community of like-minded people who have the skills to address it."

Step two: experiment with the rewards

Rewards are powerful because they satisfy cravings. But, we're often not conscious of the cravings that drive our behavior. They are only evident afterward. For example, there may be times in meetings when you want nothing more than to get out of the room and avoid a subject of conversation, even though down deep you know you should figure out how to address the problem.

To learn what a craving is, you must experiment. That might take a few days, weeks, or longer. You must feel the triggering pressure when it occurs to identify it fully. For example, ask yourself how you feel when you try to escape responsibility.

Consider yourself a scientist, just doing experiments and gathering data. The steps in your investigation are:

  1. After the first routine, start adjusting the routines that follow to see whether there's a reward change. For example, if you give up every time you see a challenge you can't address by yourself, the reward is the relief of not taking responsibility. A better response might be to discuss the issue with at least one other person who is equally concerned about the issue. The point is to test different hypotheses to determine which craving drives your routine. Are you craving the avoidance of responsibility?
     
  2. After four or five different routines and rewards, write down the first three or four things that come to mind right after each reward is received. Instead of just giving up in the face of a challenge, for instance, you discuss the issue with one person. Then, you decide what can be done.
     
  3. After writing about your feeling or craving, set a timer for 15 minutes. When it rings, ask yourself whether you still have the craving. Before giving in to a craving, rest and think about the issue one or two more times. This forces you to be aware of the moment and helps you later recall what you were thinking about at that moment.
     
  4. Try to remember what you were thinking and feeling at that precise instant, and then 15 minutes after the routine. If the craving is gone, you have identified the reward.
Step three: isolate the cue or trigger

The cue is often hard to identify because there's usually too much information bombarding you as your behaviors unfold. To identify a cue amid other distractions, you can observe four factors the moment the urge hits you:

Location: Where did it occur? ("My biggest challenges come out in meetings.")

Time: When did it occur? ("Meetings in the afternoon, when I'm tired, are the worst time, because I'm not interested in putting forth any effort.")

Feelings: What was your emotional state? ("I feel overwhelmed and depressed when I hear the problem.")

People: Who or what type of people were around you at the time, or were you alone? ("In the meetings, most other people don't seem interested in the problem either. Others dominate the discussion.")

Step four: have a plan

Once you have confirmed the reward driving your behavior, the cues that trigger it, and the behavior itself, you can begin to shift your actions. Follow these three easy steps:

  1. First, plan for the cue. ("In meetings, I'm going to look for and focus my attention on important problems that come up.")
     
  2. Second, choose a behavior that delivers the same reward but without the penalties you suffer now. ("I'm going to explore a plan to address that problem and consider what resources and skills I need to succeed. I'm going to feel great when I create a community that's able to address the problem successfully.")
     
  3. Third, make the behavior a deliberate choice each and every time, until you no longer need to think about it. ("I'm going to consciously pay attention to major issues until I can do it without thinking. I might look at agendas of future meetings, so I know what to expect in advance. Before and during every meeting, I will ask why I should be here, to make sure I'm focused on what is important.")
Plan to avoid forgetting something that must be done

To successfully start doing something you often forget, follow this process:

  1. Plan what you want to do.
  2. Determine when you want to complete it.
  3. Break the project into small tasks as needed.
  4. With a timer or daily planner, set up cues to start each task.
  5. Complete each task on schedule.
  6. Reward yourself for staying on schedule.
Habit change

Change takes a long time. Sometimes a support group is required to help change a habit. Sometimes, a lot of practice and role play of a new and better routine in a low-stress environment is required. To find an effective reward, you need repeated experimentation.

Sometimes habits are only symptoms of a more significant, deeper problem. In these cases, professional help may be required. But if you have the desire to change and accept that there will be minor failures along the way, you can gain power over any habit.

In this article, I've used examples of community development using the cue-routine-reward process. It can equally be applied to the other open organization principles. I hope this article got you thinking about how to manage habits through knowing how habits work, taking steps to change habits, and making plans to avoid forgetting things you want done. Whether it's an open organization principle or anything else, you can now diagnose the cue, the routine, and the reward. That will lead you to a plan to change a habit when the cue presents itself.

In my next article, I'll look at habits through the lens of Art Markman's thoughts on Smart Thinking.

Follow these steps to implement habits that support open culture and get rid of those that don't.


The Open Organization

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

How I use LibreOffice keyboard shortcuts

Wed, 06/15/2022 - 15:00
How I use LibreOffice keyboard shortcuts Jim Hall Wed, 06/15/2022 - 03:00

I have used word processing software for as long as I can remember. When word processors moved from direct formatting to leveraging styles to change how text appears on the page, that was a big boost to my writing.

LibreOffice provides a wide variety of styles that you can use to create all kinds of content. LibreOffice applies paragraph styles to blocks of text, such as body text, lists, and code samples. Character styles are similar, except that these styles apply to inline words or other short text inside a paragraph. Use the View -> Styles menu, or use the F11 keyboard shortcut, to bring up the Styles selector.

Image by: (Jim Hall, CC BY-SA 4.0)

Using styles makes writing longer documents much easier. Consider this example: I write a lot of workbooks and training material as part of my consulting practice. A single workbook might be 40 or 60 pages long, depending on the topic, and can include a variety of content such as body text, tables, and lists. Some of my technical training material may also include source code examples.

I have a standard training set that I offer clients, but I do custom training programs too. When working on a custom program, I might start by importing text from another workbook, and working from there. Depending on the client, I might also adjust the font and other style elements to match the client's style preferences.  For other materials, I might need to add source code examples.


To enter sample source code using direct formatting, I need to set the font and adjust the margins for each code block in the workbook. If I later decide that my workbook should use a different font for body text or source code samples, I would need to go back and change everything. For a workbook that includes more than a few code samples, this could require several hours to hunt down every source code example and adjust the font and margins to match the new preferred format.

However, by using styles, I can update the definition once to use a different font for the Text Body style, and LibreOffice Writer updates my document everywhere that uses the Text Body style. Similarly, I can adjust the font and margins for the Preformatted Text style, and LibreOffice Writer applies that new style to every source code example with the Preformatted Text style. This is the same for other blocks of text, including titles, source code, lists, and page headers and footers.

I recently had the bright idea to update the LibreOffice keyboard shortcuts to streamline my writing process. I've redefined Ctrl+B to set character style Strong Emphasis, Ctrl+I to set character style Emphasis, and Ctrl+Space to set No Character Style. This makes my writing much easier, as I don't have to pause my writing so I can highlight some text and select a new style. Instead, I can use my new Ctrl+I keyboard shortcut to set the Emphasis character style, which is essentially italics text. Anything I type after that uses the Emphasis style, until I press Ctrl+Space to reset the character style back to the default No Character Style.

Image by: (Jim Hall, CC BY-SA 4.0)

If you want to set this yourself, use Tools > Customize, then click on the Keyboard tab to modify your keyboard shortcuts.

Image by: (Jim Hall, CC BY-SA 4.0)

 

LibreOffice makes technical writing much easier with styles. And by leveraging keyboard shortcuts, I've streamlined how I write, keeping me focused on the content that I'm meant to deliver, and not its appearance. I might change the formatting later, but the styles remain the same.

Keyboard shortcuts keep me focused on the content that I'm meant to deliver, and not its appearance.


LibreOffice

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

A beginner's guide to cloud-native open source communities

Wed, 06/15/2022 - 15:00
A beginner's guide to cloud-native open source communities Anita Ihuman Wed, 06/15/2022 - 03:00

Some people think the cloud-native ecosystem has a high barrier to entry. At first glance, that looks like a logical assumption. Some of the technologies used in cloud-native projects are complex and challenging, if you're not familiar with them, so you might think you need proven expertise to get involved.

However, looks can be deceiving. This article provides a detailed roadmap to breaking into the cloud-native ecosystem as an open source contributor. I'll cover the pillars of cloud-native architecture, the Cloud Native Computing Foundation (CNCF), and ways to learn more.

Most importantly, after grounding you in the basics of cloud-native practices and communities, the article provides a three-step guide for getting started.

What is cloud native?

A program is cloud native when it's explicitly developed to be integrated, hosted, and run on a cloud computing platform. Such an application possesses the cloud's inherent characteristics, such as portability, modularity, and isolation, and it adapts to the cloud deployment models of cloud service providers (CSPs).

Cloud computing is a general term for anything that delivers hosted services over the internet. It usually implies clusters of computers, a distributed file system, and containers. A cloud can be private or public. Cloud computing comes in three major categories: Platform-as-a-Service (PaaS), Software-as-a-Service (SaaS), and Infrastructure-as-a-Service (IaaS).

From a business perspective, cloud computing means that rather than investing heavily in databases, software, and hardware, companies opt for IT services over the internet, or cloud, and pay for them as they use them.

Cloud-native infrastructure

Cloud-native infrastructure includes datacenters, operating systems, deployment pipelines, configuration management, and any system or software needed to complete the lifecycle of applications. These solutions enable engineers to make rapid, high-impact modifications with little effort, implement new designs, and execute scalable applications on public, private, and hybrid clouds.

Cloud-native architecture

Cloud architecture is a system designed to utilize cloud services. It leverages the cloud development model's distributed, reliable, scalable, and flexible nature. Networking, servers, datacenters, operating systems, and firewalls are abstracted. It enables enterprises to design applications as loosely coupled components and execute them on dynamic platforms using microservices architecture.

There are a few technologies that can be considered pillars of cloud-native architecture.

Microservices is an architectural system in which software systems are made up of small, independent services that communicate through well-defined application programming interfaces (APIs). This development method makes applications faster to develop and more scalable, encouraging innovation and accelerating time-to-market for new features. Microservices enable communication among applications using RESTful APIs and support independent deployment, updates, scaling, and restarts.
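
To make that concrete, here is a minimal sketch, not taken from any CNCF project, of a single service exposing one RESTful endpoint using only Python's standard library; the /health path and the JSON payload are illustrative assumptions.

# A minimal sketch of one microservice exposing a single RESTful endpoint.
# Uses only the Python standard library; the /health path and payload are
# illustrative assumptions rather than part of any particular project.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

if __name__ == "__main__":
    # Each microservice runs independently and communicates with others over HTTP.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()

Other services would call this endpoint over HTTP, which is exactly the kind of well-defined API boundary described above.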

DevOps refers to the philosophy, practices, and tools that promote better communication and collaboration between application development and IT operations teams. The benefits of DevOps processes include:

  • Enabling automated release pipelines and integration
  • Ensuring quick deployment to production
  • Encouraging collaboration between development and other departments

Continuous integration and continuous delivery (CI/CD) refers to a set of practices that encompass the culture, operating principles, and procedures for software development. CI/CD practices focus on automation and continuous monitoring throughout the lifecycle of apps, from integration and testing phases to delivery and deployment. The benefits of CI/CD include:

  • Enabling frequent releases
  • Shipping software more quickly
  • Receiving prompt feedback
  • Reducing the risk of release

A container is a software package that contains all of the components (binaries, libraries, programming language versions, and so on) an application needs to run in any environment, making it possible to run that application on a laptop, in the cloud, or in an on-premises datacenter. A container is the optimal carrier for microservices.

What is CNCF?

CNCF is a Linux Foundation project founded in 2015 to help advance container technology and align the tech industry around its evolution. A sub-organization of the Linux Foundation, it consists of a collection of open-source projects supported by ongoing contributions courtesy of a vast, vibrant community of programmers.

Founding members of the CNCF community include companies like Google, IBM, Red Hat, Docker, Huawei, Intel, Cisco, and others. Today, CNCF is supported by over 450 members. Its mission is to foster and sustain open source, vendor-neutral projects around cloud native.

Perhaps the most well-known project to come from CNCF is Kubernetes. The project was contributed to the Linux Foundation by Google as a seed technology and has since proven its worth by automating container-technology-based applications' deployment, scaling, and management.

Learn cloud native

There are numerous resources to help you understand the basics of cloud-native architecture and technologies. You could start with these:

  • Cloud-native glossary: The Cloud Native Glossary, a project led by the CNCF Business Value Subcommittee, is a reference for common terms when talking about cloud-native applications. It was put together to help explain cloud-native concepts in clear and straightforward language, especially for those without previous technical knowledge.

[ More resources: Kubernetes glossary ]

Cloud-native communities

Outside the CNCF projects, some other cloud-native communities and initiatives aim toward sustaining and maintaining these cloud-based projects. They include:

  • Special interest groups (SIGs) and working groups (WGs): SIGs are formed around different cloud-native elements in training and development. These groups meet weekly and discuss the community activity. You could also start a new SIG.
     
  • Cloud-native community groups: There are numerous meetup groups focused on expanding the cloud-native community on a global scale. Many have regular meetings that you can be a part of.
     
  • CNCF TAG Network: TAG (for Technical Advisory Group) Network's mission is to enable widespread and successful development, deployment, and operation of resilient and intelligent network systems in cloud-native environments.
Free training courses

Some training courses are available from the Linux Foundation to give beginners preliminary knowledge of cloud technologies.

Paid certifications

There are also paid certification programs from CNCF that test and boost your knowledge of cloud-native technologies from zero to expert. These certifications have proven to be a great addition to practitioners' careers.

  • Kubernetes and Cloud-Native Associate (KCNA): The KCNA certification prepares candidates to work with cloud-native technologies and pursue further CNCF certifications like CKA, CKAD, and CKS (see below).
     
  • Certified Kubernetes Application Developer (CKAD): A Kubernetes-certified application developer can design, build, configure, and expose cloud-native applications for Kubernetes, define application resources, and use core primitives to create, monitor, and troubleshoot scalable applications and tools in Kubernetes.
     
  • Certified Kubernetes Administrator (CKA): A certified Kubernetes administrator has demonstrated the ability to do the basic installation, configuration, and management of production-grade Kubernetes clusters. They understand critical concepts such as Kubernetes networking, storage, security, maintenance, logging and monitoring, application lifecycle, troubleshooting, and API primitives. A CKA can also establish primary use cases for end users.
     
  • Certified Kubernetes Security Specialist (CKS): Obtaining a CKS demonstrates possession of the requisite abilities to secure container-based applications and Kubernetes platforms during build, deployment, and runtime. A CKS is qualified to perform these tasks in a professional setting.
     
  • Cloud Foundry Certified Developer (CFCD): CFCD certification is ideal for candidates who want to validate their skill set using the Cloud Foundry platform to deploy and manage applications.
     
  • FinOps Certified Practitioner (FOCP): An FOCP will bring a strong understanding of FinOps, an operational framework combining technology, finance, and business to realize business and financial goals through cloud transformation. The practitioner supports and manages the FinOps lifecycle and manages the cost and usage of cloud resources in an organization.

All of these can be found in the Linux Foundation training and certification catalog.

Start your cloud-native journey in three steps

Now that you're equipped with all this information, you can choose the direction you want to take. If you're overwhelmed by the options, just go step by step:

  1. Understand the basics: Due to the complex nature of most cloud-native technologies, someone new to this ecosystem should have preliminary knowledge of the core concepts. Basic knowledge of containerization, orchestration, cloud/infrastructure, and both monolithic and microservices architecture is a good start.
     
  2. Identify a cloud-native community or project: There are over 300 cloud-native communities that exist today. It is a lot easier to break into the cloud-native community through these established groups. While some of these communities are initiatives to sustain cloud-native projects, others have projects that offer cloud-native services. You can begin your journey by participating in any of these communities. Research the groups and projects that align with your interest, then follow the onboarding steps and get familiar with the projects behind them.
     
  3. Find a niche within the community: Since most cloud-native communities are open source, the diverse skill of the community comes in handy. Explore the various opportunities that align with your skills and interest, whether that's frontend, backend, developer relations (DevRel), operations, documentation, program management, or community relations. It is easier to contribute to cloud-native projects with a well-defined niche according to your skills and experiences.

You now have a basic understanding of the cloud-native ecosystem, both from a technological and community point of view. You can further extend your knowledge now and get involved. And once you do, remember to share your journey with others in the spirit of open source!

Start participating in the cloud-native ecosystem, even if you're a complete beginner.


Kubernetes Cloud Education Containers Community management

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.
