Open-source News

12 Practical Examples of Linux Grep Command

Tecmint - Wed, 11/16/2022 - 16:20

Have you ever been confronted with the task of looking for a particular string or pattern in a file, yet have no idea where to start looking? Well then, here is grep to the rescue.

The post 12 Practical Examples of Linux Grep Command first appeared on Tecmint: Linux Howtos, Tutorials & Guides.

How open source powers innovation

opensource.com - Wed, 11/16/2022 - 16:00
By Gordon Haff

Where do people come together to make cutting-edge invention and innovation happen?

The corporate lab

One possible answer is the corporate research lab. More long-term focused than most company product development efforts, corporate labs have a long history, going back to Thomas Edison's Menlo Park laboratory in New Jersey. Perhaps most famous is Bell Labs' invention of the transistor—although software folks may associate it more with Unix and the C programming language.

But corporate laboratories have tended to be more associated with dominant firms that could afford to let large staffs work on very forward-looking and speculative research projects. After all, Bell Labs was born of the AT&T telephone monopoly. Corporate labs also aren't known for playing well with their counterparts elsewhere in the industry. Even if their focus is long-term, they're looking to profit from their IP eventually, which also means that their research is often rooted in technologies commercially relevant to their business.

A long-term focus is also hard to maintain if a company becomes less dominant or less profitable. It's a common pattern that, over time, research and development in these labs starts to look like an extension of the more near-term-focused product development process. Historically, corporate labs have focused on developing working artifacts and have benchmarked themselves against others in the industry by the size of their patent portfolios, although recently there has been more publishing of results than in years past.

The academy

Another engine of innovation is the modern research university. In the US, at least, the university as a research institution primarily emerged in the late 19th century, although some such schools had colonial-era roots, and the university research model truly accelerated after World War II.

Academia is both collaborative and siloed: collaborative in that professors will often collaborate with colleagues around the globe, siloed in that even colleagues at the same institution may not collaborate much if they're not in the same specialty. Although IP may not be protected as vigorously in a university setting as a corporate one, it can still be a consideration. The most prominent research universities make large sums of money from IP licensing.

The primary output of the academy is journal papers. This focus on publication, sometimes favoring quantity over quality, comes with a famous phrase attached: publish or perish. Furthermore, while the content of papers is far more about novel results than commercial potential, that has a flip side: research can end up being quite divorced from real-world concerns and use cases. Among other consequences, this is often not ideal for the students working on that research if they move on to industry after graduation.

Open source software

What of open source software? Certainly, major projects are highly collaborative. Open source software also supports the kind of knowledge diffusion that, throughout history, has enabled the spread of at least incremental advances in everything from viticulture to blast furnace design in 19th-century England. That said, open source software historically had a reputation primarily for being good enough and cheaper than proprietary software. That's changed significantly, especially in areas like working with large volumes of data and the whole cloud-native ecosystem. This development probably reflects how collaboration has trumped a tendency towards incrementalism in many cases. IP concerns are largely addressed by open source licensing, occasional patent and license-incompatibility issues notwithstanding.

The open source model has been less successful outside the software space. There are exceptions. There have been some wins in data, such as open government datasets and projects like OpenStreetMap—although data associated with many commercial machine learning projects, for example, is a closely guarded secret. The open instruction set architecture specification, RISC-V, is another developing success story. Taking a different approach from earlier open hardware projects, RISC-V seems to be succeeding where past projects did not.

Open source software is most focused on shipping code, of course. However, associated artifacts such as documentation and validated patterns for deploying code through GitOps processes are increasingly recognized as important.

The question

This raises an important question: How do you take what is good about each of these patterns for creating innovation and combine them? Specifically, how do you apply open source principles and practices as appropriate? That's what we've sought to accomplish with Red Hat Research.

Red Hat Research

Work towards what came to be Red Hat Research began in Red Hat's Brno office in the Czech Republic in the early 2010s. In 2018, the research program added a major academic collaboration with Boston University: the Red Hat Collaboratory. The goal was to advance research in emerging technologies in areas of joint interest, such as operating systems and hybrid cloud. The scope of projects that Red Hat and its academic partners collaborate on has since expanded considerably, although infrastructure remains a major focus.

The Collaboratory sponsors research projects led by collaborative teams of BU faculty and Red Hat engineers. It also supports fellowships and internship programs for students and organizes joint talks and workshops.

In addition to activities like those associated with the Collaboratory, Red Hat Research now publishes a quarterly magazine (sign up for your free print or PDF subscription!), runs Red Hat Research Days events, and has regional Research Interest Groups (RIGs) open to a wide range of participants. Red Hat engineers also teach classes and work with students and faculty to produce prototypes, demos, and jointly authored research papers as well as code.

What sorts of open source research projects?

Red Hat Research participates in research projects with universities around the world. These are just a few recent examples from North American partnerships.

  • Machine learning for cloud ops: Continuous Integration/Continuous Delivery (CI/CD) environments move at a breakneck pace using a wide variety of components. Relying on human experts to manage these processes is unreliable, costly, and often not scalable. The AI for Cloud Ops project, housed at the BU Collaboratory, aims to provide AI-driven analytics and heavily automated "ops" functionality to improve performance, resilience, and security in the cloud without incurring high operation costs. The project's principal investigator, BU professor Ayse Coskun, was interviewed about the project for a recent Red Hat Research Quarterly.
     
  • Rust memory safety: Rust is a relatively new language that aims to be suitable for low-level programming tasks while avoiding the serious memory-safety pitfalls that a language like C suffers from. The problem: Rust has an "unsafe" keyword that suspends some of the compiler's memory checks within a specified code block. There are often good reasons to use this keyword, but it then falls to the developer to ensure their code is memory safe, which sometimes does not work out so well. Researchers at Columbia University and Red Hat engineers are exploring methods for the automated detection of vulnerabilities in Rust, which can then be used to help automate the development, testing, and debugging of real-world software (see the toy sketch after this list).
     
  • Linux-based unikernel: A unikernel is a single bootable image consisting of user code linked with additional components that provide kernel-level functionality, such as opening files. The resulting program can boot and run on its own as a single process, in a single address space, at an elevated privilege level without a conventional operating system. This fusion of application and kernel components is very lightweight and can have performance and security advantages. A considerable team of BU and Red Hat engineers has been working on adding unikernel capabilities into the same source code tree as regular Linux.
     
  • Image provenance analysis: It's increasingly easy to create composite or otherwise altered images to spread malicious, untrue information intended to influence behavior. A research collaboration among the University of Notre Dame, Loyola University, and Red Hat is using graph clustering and other techniques to develop scalable mechanisms for detecting these images. Among the project's goals is to see whether there's also the potential to look at metadata or other sources of information. The upstream pyIFD project has come out of this research.
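
To make the Rust item above concrete, here is the promised toy sketch. It is emphatically not the researchers' method: it is just a naive Python scanner that flags every mention of `unsafe` in a Rust source tree so those blocks can be reviewed by a human or handed to real analysis tooling. The directory argument is a placeholder.

```python
import re
import sys
from pathlib import Path

# Toy illustration only: real vulnerability detection needs semantic
# analysis, not keyword matching. This just surfaces `unsafe` usage.
UNSAFE_RE = re.compile(r"\bunsafe\b")

def scan_rust_tree(root):
    """Yield (file, line number, line) for every line mentioning `unsafe`."""
    for path in Path(root).rglob("*.rs"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if UNSAFE_RE.search(line):
                yield str(path), lineno, line.strip()

if __name__ == "__main__":
    # Placeholder path: point this at any Rust checkout.
    for file, lineno, line in scan_rust_tree(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{file}:{lineno}: {line}")
```
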
Only the beginning

The above is just a small sample of the many innovative research projects Red Hat Research is involved with. I encourage you to head over to the Red Hat Research projects page to see all the other exciting work going on.

This article is adapted from a post on the Red Hat Research blog and is republished with permission.

If you're looking for the next big thing in computing technology, start the search in open source communities.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

How to address challenges with community metrics

opensource.com - Wed, 11/16/2022 - 16:00
By Georg Link

The previous two articles in this series looked at open source community health and the metrics used to understand it. They showed examples of how open source communities have measured their health through metrics. This final article brings those ideas together, discussing the challenges of implementing community health metrics for your own community.

Organizational challenges

First, you must decide which metrics you want to examine. This requires understanding your questions about reaching your goals as a community. The metrics relevant to you are those that can answer your questions. Otherwise, you risk being overwhelmed by the amount of data available.

Second, you need to anticipate how you want to react to the metrics. This is about making decisions based on what your data shows you. For example, this includes managing engagement with other community members, as discussed in previous articles.

Third, you must differentiate between good and bad results in your metrics. A common pitfall is to compare your community to other communities, but the truth is that every community works and behaves differently. You can't necessarily even compare metrics within the same project. For example, you may be unable to compare the number of commits in repositories within the same project because one may be squashing commits while the other might have hundreds of micro commits. You can establish a baseline of where you are and have been and then see whether you've improved over time.

Privacy

The final organizational challenge I want to discuss is Personally Identifiable Information (PII) concerns. One of open source's core values and strengths is the transparency of how contributors work. This means everyone has information about who's engaged, including their name, email address, and possibly other information. There are ethical considerations about using that data.

In recent years, regulations like the European General Data Protection Regulation (GDPR) have defined legal requirements for what you can and cannot do with PII data. The key question is whether you need to ask everyone's permission to process their data. This is an opt-in strategy. On the other hand, you might choose to use the data and provide an opt-out process.

This distinction is important. For instance, suppose you're providing metrics and dashboards as a service to your community. In an effort to improve the community, you might make the case that the (already publicly available) information has greater value for the community once it's processed. Either way, make it clear what data you use and how you use it.

Technical challenges

Where is your community data being collected? To answer this, consider all the places and platforms your community is engaging in. This includes the software repository, whether it's GitLab, GitHub, Bitbucket, Codeberg, or just a mailing list and a Git server. It may also include issue trackers, a change request workflow system like Gerrit, or a wiki.

But don't stop at the software development interactions. Where else does the community exist? These could be forums, mailing lists, instant messaging channels, question-and-answer sites, or meetups. There's a lot of activity in open source communities that doesn't strictly involve software development work but that you want to recognize in your metrics. These non-coding activities may be hard to track automatically, but you should pay special attention to them or risk ignoring important community members.

With all of these considerations addressed, it's time to take action.

1. Retrieve the data

Once you've identified the data sources, you must get the data and make it useful. Collecting raw data is almost always the easiest step. You have logs and APIs for that. Once set up, the (hopefully occasional) main challenge is when APIs and log formats change.
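
As a minimal sketch of this step, the snippet below pulls recent commit records from the public GitHub REST API using only Python's standard library. The repository is an arbitrary example, and unauthenticated requests are rate-limited, so real collectors (see the tools listed later) do much more.

```python
import json
import urllib.request

# Example repository; substitute your own project's owner/repo.
URL = "https://api.github.com/repos/chaoss/grimoirelab/commits?per_page=20"

def fetch_commits(url):
    """Return the raw commit records GitHub serves for one page."""
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

for c in fetch_commits(URL):
    author = c["commit"]["author"]           # name/email/date as recorded in Git
    print(author["date"], author["name"], c["sha"][:7])
```
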

2. Enrich the data

Once you have the data, you probably need to enrich it.

First, you must unify the data. This step includes converting data into a standard format, which is no small feat. Just think of all the different ways to express a simple date. The order of the year, month, and day varies between regions; dates may use dots, slashes, or other symbols, or they can be expressed as seconds since the Unix epoch. And that's just a timestamp!
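
A small illustration of the idea: try each known format in turn and emit one canonical representation, here ISO 8601 in UTC. The formats and sample values below are illustrative, not exhaustive.

```python
from datetime import datetime, timezone

KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d.%m.%Y", "%m/%d/%Y")

def normalize_timestamp(value):
    """Coerce epoch seconds or known string formats into an aware UTC datetime."""
    if isinstance(value, (int, float)):                  # Unix epoch seconds
        return datetime.fromtimestamp(value, tz=timezone.utc)
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")

for raw in (1668556800, "2022-11-16 16:00:00", "16.11.2022", "11/16/2022"):
    print(normalize_timestamp(raw).isoformat())
```
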

Whatever your raw data format is, make it consistent for analysis. You also want to determine the level of detail. For example, when you look at a Git log, you may only be interested in when a commit was made and by whom, which is high-level information. Then again, maybe you also want to know what files were touched or how many lines were added and removed. That's a detailed view.

You may also want to track metadata about different contributions. This may involve adding contextual information on how the data was collected or the circumstances under which it was created. For example, you could tag contributions made during the Hacktoberfest event.
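
A minimal sketch of such tagging, assuming (purely for illustration) that each contribution record carries an ISO 8601 `created_at` field:

```python
from datetime import datetime

def tag_contribution(record):
    """Attach contextual tags derived from when the contribution was made."""
    created = datetime.fromisoformat(record["created_at"])
    tags = []
    if created.month == 10:          # Hacktoberfest runs every October
        tags.append(f"hacktoberfest-{created.year}")
    record["tags"] = tags
    return record

print(tag_contribution({"author": "alice", "created_at": "2022-10-05T12:00:00+00:00"}))
# {'author': 'alice', 'created_at': '2022-10-05T12:00:00+00:00', 'tags': ['hacktoberfest-2022']}
```
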

Finally, standardize the data into a format suitable for analysis and visualization.

When you care about who is active in your community (and possibly what organizations they work for), you must pay special attention to identity. This can be a challenge because contributors may use different usernames and email addresses across the various platforms. You need a mechanism to track an individual by several online identifiers, such as an issue tracker, mailing list, and chat.
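
One simple approach is an alias table mapping every known identifier to a canonical identity; tools such as GrimoireLab's SortingHat handle this far more thoroughly. A toy sketch with invented names:

```python
# Invented identifiers for illustration: the same person appears under
# three different handles across Git, the issue tracker, and chat.
ALIASES = {
    "jdoe@example.com": "Jane Doe",   # Git commit email
    "jdoe":             "Jane Doe",   # issue tracker username
    "jane_d":           "Jane Doe",   # chat nickname
}

def canonical_identity(identifier):
    """Resolve a platform-specific identifier to one canonical identity."""
    return ALIASES.get(identifier, identifier)

events = [("commit", "jdoe@example.com"), ("issue", "jdoe"), ("chat", "jane_d")]
for source, who in events:
    print(f"{source}: {who} -> {canonical_identity(who)}")
```
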

You can also pre-process data and calculate metrics during the data enrichment phase. For example, the original raw data may have timestamps for when an issue was opened and closed, but what you really want to know is the number of days the issue has been open. You may also have categorization criteria for contributions, such as identifying which contributions came from core contributors who do a lot of work in a project and how many "fly-by" contributors show up once and then leave. Doing these calculations during the enrichment phase makes it easier to visualize and analyze the data and requires less overhead at later stages.
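
A rough sketch of both kinds of derived metrics, using made-up records and an arbitrary cutoff for what counts as a "core" contributor:

```python
from collections import Counter
from datetime import datetime

# Made-up issue records; real data would come from the enrichment pipeline.
issues = [
    {"opened": "2022-10-01", "closed": "2022-10-15"},
    {"opened": "2022-11-01", "closed": "2022-11-03"},
]
for issue in issues:
    opened = datetime.fromisoformat(issue["opened"])
    closed = datetime.fromisoformat(issue["closed"])
    issue["days_open"] = (closed - opened).days      # store the derived metric

# Classify contributors by commit count; the threshold is arbitrary.
commit_authors = ["alice"] * 40 + ["bob"] * 2 + ["carol"]
counts = Counter(commit_authors)
roles = {name: ("core" if n >= 20 else "fly-by") for name, n in counts.items()}

print([i["days_open"] for i in issues])   # [14, 2]
print(roles)                              # {'alice': 'core', 'bob': 'fly-by', 'carol': 'fly-by'}
```
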

3. Make data useful

Now that your data is ready, you must decide how to make it useful. This involves figuring out who the user of the information is and what they want to do with it. This helps determine how to present and visualize the data. One thing to remember is that the data may be interesting but not impactful by itself. The best way to use the data is to make it part of a story about your community.

You can use the data in two ways to tell your community story:

  • Have a story in mind, and then verify that the data supports how you perceive the community. You can use the data as evidence to corroborate the story. Of course, you should also look for evidence that your story is incorrect and try to refute it, similar to how you would test a scientific hypothesis.
  • Use data to find anomalies and interesting developments you wouldn't have otherwise observed. The results can help you construct a data-driven story about the community by providing a new perspective that perhaps has outgrown casual observation.
Solve problems with open source

Before you address the technical challenges, here's the good news: you're working with open source technology, and others have already solved many of the challenges you're facing. There are several open source solutions available to you:

  • CHAOSS GrimoireLab: The industry standard and enterprise-ready solution for community health analytics.
  • CHAOSS Augur: A research project with a well-defined data model and bleeding-edge functionality for community health analytics.
  • Apache Kibble: The Apache Software Foundation's solution for community health analytics.
  • CNCF DevStats: CNCF's GitHub statistics for community health analytics.

To overcome organizational challenges, rely on the CHAOSS Project, a community of practice around community health.

The important thing to remember is that you and your community aren't alone. You're a part of a larger community that's constantly growing.

I've covered a lot in the past three articles. Here's what I hope you take away:

  • Use metrics to identify where your community needs help.
  • Track whether specific actions lead to changes.
  • Track metrics early, and establish a baseline.
  • Gather the easy metrics first, and get more sophisticated later.
  • Present metrics in context. Tell a story about your community.
  • Be transparent with your community about metrics. Provide a public dashboard and publish reports.

Consider this advice for addressing the organizational and technical challenges of implementing community health metrics for your own community.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Emilio has 5+ years of experience in business and marketing across different industries and countries. His academic background includes a Bachelor's degree in Business from Americana University and a Master's degree in Marketing and Sales from the EAE Business School.

Emilio is currently the Marketing Specialist at Bitergia in Madrid, Spain. He creates content and writes about open source community, metrics, and analytics.

Outside of work, his hobbies include traveling, playing video games, and boxing.

LinkedIn profile: https://www.linkedin.com/in/emilio-galeano-gryciuk/


10 Most IT Skills In Demand To Master This Year

Tecmint - Wed, 11/16/2022 - 13:20

With many companies and organizations having more IT (Information technology) job openings, there is no denying that many are enticed to dive into the industry. This is because the IT industry is one of…

The post 10 Most IT Skills In Demand To Master This Year first appeared on Tecmint: Linux Howtos, Tutorials & Guides.

Red Hat EMEA Digital Leaders Awards 2022: And the regional winners are...

Red Hat News - Wed, 11/16/2022 - 08:00
In my previous post (https://www.redhat.com/en/blog/transforming-world-announcing-our-red-hat-digital-leaders-2022)…

Innovative HID-BPF Expected To Land In Linux 6.2

Phoronix - Wed, 11/16/2022 - 04:30
Adding to the growing list of changes expected to be sent in during the Linux 6.2 merge window next month is HID-BPF. This is the Red Hat-led effort around using eBPF within the HID subsystem for input devices...
