opensource.com


Happy Ada Lovelace Day! Here are our favorite tech books for kids

Lauren Pritchett | Tue, 10/11/2022

Tech is everywhere! And it's important to remember that everyone is a beginner once. The industry is continually growing and looking for new people who can add valuable perspectives. How do we inspire our next batch of tech contributors?

In honor of Ada Lovelace Day, we asked our community of writers if they had any resources or books they recommend for young programmers or enthusiasts. Here's what they had to share.

Start from Scratch

I recommend Coding Games in Scratch: A Step-by-Step Visual Guide to Building Your Own Computer Games by Jon Woodcock and Code Your Own Games!: 20 Games to Create with Scratch by Max Wainewright. Being a LEGO family, all of my kids have spent time with the LEGO BOOST set. It involves a minimal amount of coding but is really focused on some of the foundational skills involved in coding, like logical, sequential thinking and creative problem-solving.

Katie Richards

My 4-year-old loves ScratchJr. They have an official book, The Official ScratchJr Book: Help Your Kids Learn to Code by Marina Umaschi Bers and Mitchel Resnick. There are also fun board books about science that are perfect for babies.

Sanchita Narula

Probably the most important thing is that kids love to think—and to think critically. There are two books that help to achieve this in a fun way and also inspire kids to love math and to question everything: The Hitchhiker's Guide to the Galaxy by Douglas Adams and Humble Pi by Matt Parker. Scratch is ideal for a student's initial exposure to coding because understanding basic concepts is usually easier visually.

Peter Czanik

Programming for the whole family

My favorite books for young programmers come from the publisher No Starch Press:

  • Electronics for Kids: Play with Simple Circuits and Experiment with Electricity! by Oyvind Nydal Dahl
  • Teach Your Kids to Code: A Parent-Friendly Guide to Python Programming by Bryson Payne
  • Raspberry Pi Projects for Kids: Create an MP3 Player, Mod Minecraft, Hack Radio Waves, and More! by Dan Aldred
  • Python for Kids: A Playful Introduction To Programming by Jason R. Briggs

Michelle Colon

Learn to code by coding

I don't know if I was inspired by fiction books. But the Apple II computer was very popular when I was growing up. Our parents bought an Apple II computer for my brother and me to use, and we taught ourselves how to write programs in Applesoft BASIC. We also had several "teach yourself programming" books that we read cover to cover. I learned a lot by reading, then trying it out on my own, then figuring out how to make it better.
 
I encourage other developers to do the same. Pick up a book about programming, and just try it out. Make programs that you find interesting; don't worry about what others think. Write a number guessing game, or a turn-based text adventure game, or a puzzle game. Or experiment by writing command line tools that you find useful. A good place to start is to write your own versions of some of the Linux command line tools like cat or more. What matters most is that you try something, and then you make it better. That's a great way to teach yourself about programming.

Jim Hall

In response to Jim:

That's an excellent point! Now that I think back, my earliest formative exposure to programming was more or less the same. My parents couldn't easily afford to buy me and my brother tapes of popular games (yes, tapes... for those not old enough to have experienced it first hand, games were usually sold in a clear plastic bag or bubble package containing a paper manual and a cassette tape). Instead, we had subscriptions to home computing magazines which included articles interspersed with printed source code for programs, often games. There were also paperback compilations of these we sometimes received as gifts. If we wanted to play a game on the computer, we had to type it all in first (and inevitably debug it too), so we'd take turns and check each other's work.
 

I doubt that approach would have worked for later generations, but it was certainly an incentive to fast-track our exposure to software development. It naturally followed that as soon as we got bored with a game, we didn't just pop in something else. We were familiar enough with the code, from hours of painstakingly typing it in and fixing all our errors (sometimes even finding and correcting mistakes in the printed version!), that the obvious solution was to change the game in order to make it more fun. A war strategy game based in ancient Rome could be trivially turned into a space battle setting just by altering some names and terminology. A particular game mechanic which made things less enjoyable could be adjusted or removed to improve the experience and suit our personal tastes.

Little did I know back then I'd still be doing pretty much the same thing 40 years later, collaboratively editing and reviewing changes to source code. And now I can get paid for it!

Jeremy Stanley

All about Ada

My 7-year-old daughter is venturing into chapter books while my 3-year-old son is still into colorful board books that he can carry around the house. Both still adore picture books because they can really see themselves in the stories. I'm teaching them about Ada Lovelace in a way that helps them grasp her impact. There are lots of books that tell her story, and we own several of them.

A bonus read I recommend to parents of young kids is How to Code a Sandcastle by Josh Funk and Sara Palacios. Pearl (how great is that name?!) is trying to teach her robot, Pascal, how to build a sandcastle. She encounters a few bugs along the way, but finally creates her palace thanks to problem-solving and cooperation.
 



Groovy vs Java: Connecting a PostgreSQL database with JDBC

Chris Hermansen | Tue, 10/11/2022

Lately, I've been looking at how Groovy streamlines the slight clunkiness of Java. This article examines some differences between connecting to a PostgreSQL database using JDBC in Java versus Groovy.

Install Java and Groovy

Groovy is based on Java and requires a Java installation. Recent versions of both Java and Groovy may be in your Linux distribution's repositories, or you can install Groovy by following these instructions. A nice alternative for Linux users is SDKMan, which provides multiple versions of Java, Groovy, and many other related tools. For this article, I'm using SDKMan's releases of:

  • Java 11.0.12-open (OpenJDK 11)
  • Groovy version 3.0.8
Back to the problem

If you haven't already, please review this article on installing JDBC and this article on setting up PostgreSQL.

Whether you're using Java or Groovy, several basic steps happen in any program that uses JDBC to pull data out of a database:

  1. Establish a Connection instance to the database back end where the data resides.
  2. Using an SQL string, get an instance of a Statement (or something similar, like a PreparedStatement) that will handle the execution of the SQL string.
  3. Process the ResultSet instance returned by having the Statement instance execute the query, for example, printing the rows returned on the console.
  4. Close the Statement and Connection instances when done.

In Java, the correspondence between the code and the list above is essentially one-for-one. Groovy, as usual, streamlines the process.

Java example

Here's the Java code to look at the land cover data I loaded in the second article linked above:

1  import java.sql.Connection;
2  import java.sql.Statement;
3  import java.sql.ResultSet;
4  import java.sql.DriverManager;
5  import java.sql.SQLException;
       
6  public class TestQuery {
       
7      public static void main(String[] args) {
       
8          final String url = "jdbc:postgresql://localhost/landcover";
9          final String user = "clh";
10          final String password = "carl-man";
       
11          try (Connection connection = DriverManager.getConnection(url, user, password)) {
       
12              try (Statement statement = connection.createStatement()) {
       
13                  ResultSet res = statement.executeQuery("select distinct country_code from land_cover");
14                  while (res.next()) {
15                      System.out.println("country code " + res.getString("country_code"));
16                  }
       
17              } catch (SQLException se) {
18                  System.err.println(se.getMessage());
19              }
20          } catch (SQLException ce) {
21              System.err.println(ce.getMessage());
22          }
23      }
24  }

Lines 1-5 are the necessary import statements for the JDBC classes. Of course, I could shorten this to import java.sql.*, but that sort of thing is somewhat frowned upon these days.

Lines 6-24 define the public class TestQuery I will use to connect to the database and print some of the contents of the main table.

Lines 7-23 define the main method that does the work.

Lines 8-10 define the three strings needed to connect to a database: the URL, the user name, and the user password.

Lines 11-22 use a try-with-resources to open the Connection instance and automatically close it when done.

Lines 12-19 use another try-with-resources to open the Statement instance and automatically close it when done.

Line 13 creates the ResultSet instance to handle the SQL query, which uses SELECT DISTINCT to get all unique values of country_code from the land_cover table in the database.

Lines 14-16 process the result set returned by the query, printing out the country codes one per line.

Lines 17-19 and 20-22 handle any SQL exceptions.

Groovy example

I'll do something similar in Groovy:

1  import groovy.sql.Sql
       
2  final String url = "jdbc:postgresql://localhost/landcover"
3  final String user = "me"
4  final String password = "my-password"
5  final String driver = "org.postgresql.Driver"
       
6  Sql.withInstance(url, user, password, driver) { sql ->
       
7      sql.eachRow('select distinct country_code from land_cover') { row ->
8        println "row.country_code ${row.country_code}"
9      }
10  }

Okay, that's a lot shorter: 10 lines instead of 24! Here are the details:

Line 1 is the only import needed. In my view, not having to jump around JavaDocs for three different classes is a distinct advantage.

Lines 2-5 define the four strings needed to connect to a database using the Sql class. The first three are the same as for java.sql.Connection; the fourth names the driver I want.

Line 6 is doing a bunch of heavy lifting. The method call is Sql.withInstance(), similar to other uses of "with" in Groovy. The call:

  • Creates an instance of Sql (connection, statement, etc.).
  • Takes a closure as its final parameter, passing it the instance of Sql that it created.
  • Closes the instance of Sql when the closure exits.

Line 7 calls the eachRow() method of the Sql instance, wrapping the creation and processing of the result set. The eachRow() method takes a closure as its final argument and passes each row to the closure as it processes the returned lines of data from the table.

Groovy can simplify your life

For those of you whose day job involves scripting and relational databases, I think it's pretty obvious from the above that Groovy can simplify your life. A few closing comments:

  • I could have accomplished this similarly to the Java version; for example, instead of calling sql.eachRow(), I could have called sql.query(), which takes a closure as its last argument and passes a result set to that closure, at which point I would probably have used a while() as in the Java version (or maybe each()).
  • I could also read the resulting rows into a list, all at once, using a call to sql.rows(), which can transform the data in a manner similar to using .collect() on a list. Both alternatives are sketched after this list.
  • Remember that the SQL passed into the eachRow() (or query()) method can be arbitrarily complex, including table joins, grouping, ordering, and any other operations supported by the database in question.
  • Note that SQL can also be parameterized when using an instance of PreparedStatement, which is a nice way to avoid SQL injection if any part of the SQL comes in from outside the coder's sphere of control. The sketch after this list includes a parameterized query.
  • This is a good moment to point the diligent reader to the JavaDocs for groovy.sql.Sql.
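
Here is a minimal sketch of those alternatives, assuming the same landcover database, credentials, and driver as in the example above (the 'CA' country code in the last query is just an illustrative value):

import groovy.sql.Sql

final String url = "jdbc:postgresql://localhost/landcover"
final String user = "me"
final String password = "my-password"
final String driver = "org.postgresql.Driver"

Sql.withInstance(url, user, password, driver) { sql ->
    // sql.rows() pulls the whole result set into a List of GroovyRowResult objects
    def codes = sql.rows('select distinct country_code from land_cover')
    println "found ${codes.size()} distinct country codes"

    // sql.query() hands the raw ResultSet to the closure, much like the Java version
    sql.query('select distinct country_code from land_cover') { resultSet ->
        while (resultSet.next()) {
            println "country code ${resultSet.getString('country_code')}"
        }
    }

    // parameters can be passed as a list, which uses a PreparedStatement under the hood
    sql.eachRow('select count(*) as n from land_cover where country_code = ?', ['CA']) { row ->
        println "rows for CA: ${row.n}"
    }
}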
Groovy resources

The Apache Groovy language site provides a good tutorial-level overview of working with databases, including other ways to connect, plus additional operations, including insertions, deletions, transactions, batching, pagination—the list goes on. This documentation is quite concise and easy to follow, at least partly because the facility it is documenting has itself been designed to be concise and easy to use!


Defining an open source AI for the greater good

Stefano Maffulli | Mon, 10/10/2022

Artificial intelligence (AI) has become more prevalent in our daily lives. While AI systems may intend to offer users convenience, there have been numerous examples of automated tools getting it wrong, resulting in serious consequences. What's happening in an AI system that leads to erroneous and harmful conclusions? Probably a combination of flawed AI and a lack of human oversight. How do we as a society prevent AI ethics failures?

The open source community has had, for well over 20 years, clear processes for dealing with errors ("bugs") in software. The Open Source Definition firmly establishes the rights of developers and the rights of users. There are frameworks, licenses, and a legal understanding of what needs to be done. When you find a bug, you know who to blame, you know where to report it, and you know how to fix it. But when it comes to AI, do you have the same understanding of what you need to do in order to fix a bug, error, or bias?

In reality, there are many facets of AI that don't fit neatly into the Open Source Definition.

Establishing boundaries for AI

What's the boundary between the data that trains an AI system and the software itself? In many ways, AI systems are like black boxes: it isn't really understood what happens inside, and they offer very little insight into how a system has reached a specific conclusion. You can't inspect the networks inside that are responsible for making a judgment call. So how can open source principles apply to these "black boxes" making automated decisions?

For starters, you need to take a step back and understand what goes into an AI's automated decision-making process.

The AI decision process

The AI process starts with collecting vast amounts of training data: data scraped from the internet, tagged and cataloged, and fed into a model to teach it how to make decisions on its own. However, the process of collecting a set of training data is itself problematic. It's a very expensive and time-consuming endeavor, so large corporations are better positioned to have the resources to build large training sets. Companies like Meta (Facebook) and Alphabet (Google) have been collecting people's data and images for a long, long time. (Think of all the pictures you've uploaded since before Facebook or even MySpace existed. I've lost track of all the pictures I've put online!) Essentially anything on the internet is fair game for data collection, and today mobile phones are basically real-time sensors feeding data and images to a few mega-corporations and then to internet scrapers.

Examining the data going into the system is just scratching the surface. I haven't yet addressed the models and neural networks themselves. What's in an AI model? How do you know when you're chatting with a bot? How do you inspect it? How do you flag an issue? How do we fix it? How do we stop it in case it gets out of control?

It's no wonder that governments around the world are not only excited about AI and the good that AI could do, but also very concerned about the risks. How do we protect each other, and how do we ask for a fair AI? How do we establish not just rules and regulations, but also social norms that help us all define and understand acceptable behavior? We're just now beginning to ask these questions, and only just starting to identify all the pieces that need to be examined and considered.

To date, there aren't any guiding principles or guardrails to orient the conversation between stakeholders in the same way that, for instance, the GNU Manifesto and later the Open Source Definition provide. So far, everyone (corporations, governments, academia, and others) has moved at their own pace, and largely for their own self-interests. That's why the Open Source Initiative (OSI) has stepped forward to initiate a collaborative conversation.

Open Source Initiative

The Open Source Initiative has launched Deep Dive: AI, a three-part event to uncover the peculiarities of AI systems, to build understanding around where guardrails are needed, and to define Open Source in the context of AI. Here's a sampling of what the OSI has discovered so far.

Copyright

AI models may not be covered by copyright. Should they be?

Developers, researchers, and corporations share models publicly, some with an Open Source software license. Is that the right thing to do?

The output of AI may not be covered by copyright. That raises an interesting question: Do we want to apply copyright to this new kind of artifact? After all, copyleft was invented as a hack for copyright. Maybe this is the chance to create an alternative legal framework.

The release of the new Stable Diffusion model raises issues around the output of the models. Stable Diffusion has been trained on lots of images, including those owned by Disney. When you ask it to, for instance, create a picture of Mickey Mouse going to the US Congress, it spits out an image that looks exactly like Mickey Mouse in front of the US Capitol Building. That image may not be covered by copyright, but I bet you that the moment someone sells t-shirts with these pictures on them, Disney will have something to say about it.

No doubt we'll have a test case soon. Until then, delve more into the copyright conundrum in the Deep Dive: AI podcast Copyright, selfie monkeys, the hand of God.

Regulation

The European Union is leading the way on AI regulation, and its approach is interesting. The AI Act is a worthwhile read. It's still in draft form, and it could be some time before it is approved, but its legal premise is based on risk. As it stands now, EU legislation would require extensive testing and validation, even on AI concepts that are still in their rudimentary research stages. Learn more about the EU's legislative approach in the Deep Dive: AI podcast Solving for AI's black box problem.

Datasets

Larger datasets raise questions. Most of the large, publicly available datasets that are being used to train AI models today comprise data taken from the web. These datasets are collected by scraping massive amounts of publicly available data and also data that is available to the public under a wide variety of licenses. The legal conditions for using this raw data are not clear. This means machines are assembling petabytes of images with dubious provenance, not only because of questionable legal rights associated with the uses of these images, code and text, but also because of the often illicit content. Furthermore, we must acknowledge that this internet data has been produced by the wealthier segment of the world's population—the people with access to the internet and smartphones. This inherently skews the data. Find out more about this topic in the Deep Dive: AI podcast When hackers take on AI: Sci-fi – or the future?

Damage control

AI can do real damage. Deep fakes are a good example. A deep fake AI tool enables you to superimpose the face of one person on the body of another. These are popular tools in the movie industry, for example. Unfortunately, deep fake tools are also used for nefarious purposes, such as making it appear that someone is in a compromising situation, or to distribute malicious misinformation. Learn more about deep fakes in the Deep Dive: AI podcast Building creative restrictions to curb AI abuse.

Another example is the stop button problem, where a machine trained to win a game can become so aware that it needs to win that it becomes resistant to being stopped. It sounds like science fiction, but it is an identified mathematical problem that research communities are aware of, and have no immediate solution for.

Hardware access

Currently, no real Open Source hardware stack for AI exists. Only an elite few have access to the hardware required for serious AI training and research. The volume of data consumed and generated by AI is measured in terabytes and petabytes, which means that special hardware is required to perform speedy computations on data sets of this size. Specifically, without graphic processing units (GPUs), an AI computation could take years instead of hours. Unfortunately, the hardware required to build and run these big AI models is proprietary, expensive, and requires special knowledge to set up. There are a limited number of organizations that have the resources to use and govern the technology.

Individual developers simply don't have the resources to purchase the hardware needed to run these data sets. A few vendors are beginning to release hardware with Open Source code, but the ecosystem is not mature. Learn more about the hardware requirements of AI in the Deep Dive: AI podcast Why Debian won't distribute AI models anytime soon.

AI challenges

The Open Source Initiative protects open source against many threats today, but it also anticipates tomorrow's challenges, such as AI. AI is a promising field, but it can also deliver disappointing results. Some AI guardrails are needed to protect creators, users, and the world at large.

The Open Source Initiative is actively encouraging dialogue. We need to understand the issues and implications and help communities establish shared principles that ensure AI is good for us all. Join the conversation by attending the four Deep Dive: AI panel discussions starting on October 11.


Learn programming at Open Jam 2022

Seth Kenlon | Fri, 10/07/2022

The Open Jam game jam is happening from October 28 to November 9 this year. Every year for the past several years, programmers from around the globe build open source video games, and then play and rate one another's games. Just for fun.

Open Jam is a "game jam," which is a casual way to inspire programmers of all skill levels to focus on a just-for-fun project for a concentrated period of time. It promotes open source games as well as open source game creation tools. While participants aren't required to use open source tools to create their game, the games themselves must bear an open license, and you literally get extra points during the scoring round for using open source tools.

Another important aspect of the jam, for me, is that it promotes alternate win conditions, but not in the way you might think.

Alternate win conditions

When you play a game, there's almost always something called a win condition. That's the thing that, when it happens, identifies the winner of the game. It could be that you capture the other player's King, or that you acquire 10 points before anyone else does, or that you have a better hand of cards than anyone else at the table.

Traditional games usually have a single win condition. When a specific thing happens, there's a single winner and the game is over. A modern trend in gaming, though, allows for multiple win conditions. Maybe you either secure 3 safehouses, or you find a cure for the virus that's started the zombie apocalypse. Maybe you find the buried treasure or seize the enemy pirate ship.

[Image by Astropippin, CC BY-SA 4.0]

What's always been true, though, is that each player can have a secret, private, personal win condition. This is hard for some people to understand, depending on how competitive you are, but not everyone plays a game to win. There are a lot of modern games that are just fun to play. Maybe the theme of the game is comforting or inspiring, making it fun to just spend time in the world. Or maybe the game makes you feel like you're actively solving a puzzle, and whether you solve that puzzle faster or more efficiently than somebody else doesn't satisfy you nearly as much as the act of searching for the solution. Or a game might have interesting mechanics you enjoy using in the most inventive ways you can think of.


Open Jam is the same way. As a programmer, you're free to participate in Open Jam whether or not you're hoping to win first place, or any place at all for that matter. Your "win condition" for your Open Jam experience can be anything:

  • Learn a new programming language

  • Learn a new game engine

  • Actually finish a project

  • Make a fun game for your child or a younger sibling

  • Experiment with the concept of what a "game" can be

  • Write some code without over-thinking it

  • Give yourself a reason to order pizza for every meal while guiltlessly sitting at your computer for an entire weekend

Those are just a few ideas, and looking at past Open Jams reveals that a lot of programmers who participated probably had their own win conditions for the jam weekend. There are quirky games, short games, impossible games, esoteric games. There are adventure games and platformers and shooters. There are games made with Godot, and with Python, and JavaScript, and Java. Some games you probably only need to play once, and other games you might keep going back to. There's a lot on offer from each Open Jam.

Join Open Jam 2022

If you're interested in programming but haven't been able to find the right project, or you've been afraid to start a big project, then writing a bite-sized game for Open Jam 2022 could be just the challenge you've been looking for. Be sure to bring a couple of your own win conditions, and make sure one of them is to "have fun."


Dynamically update TLS certificates in a Golang server without downtime

Savita Ashture | Thu, 10/06/2022

Transport Layer Security (TLS) is a cryptographic protocol, based on SSLv3, designed to encrypt traffic between two sites. In other words, TLS ensures that you're visiting the site you meant to visit and prevents anyone between you and the website from seeing the data being passed back and forth. This works through digital certificates: the web server presents its certificate and holds the matching private key, and clients verify that certificate against the certificates of trusted authorities, which are typically distributed with web browsers.

In production environments, all servers run securely, but server certificates eventually expire. It is then the server's responsibility to validate, regenerate, and start using the new certificates without any downtime. In this article, I demonstrate how to update TLS certificates dynamically in an HTTPS server written in Go.

These are the prerequisites for following this tutorial:

  • A basic understanding of client-server working models
  • A basic knowledge of Golang
Configuring an HTTP server

Before I discuss how to update certs dynamically on an HTTPS server, I'll provide a simple HTTP server example. In this case, I need only the http.ListenAndServe function to start an HTTP server and http.HandleFunc to register a response handler for a particular endpoint.

To start, configure the HTTP server:

package main

import(
        "net/http"
        "fmt"
        "log"
)

func main() {

        mux := http.NewServeMux()
        mux.HandleFunc("/", func( res http.ResponseWriter, req *http.Request ) {
                fmt.Fprint( res, "Running HTTP Server!!" )
        })
       
        srv := &http.Server{
                Addr: fmt.Sprintf(":%d", 8080),
                Handler:           mux,
        }

        // run server on port "8080"
        log.Fatal(srv.ListenAndServe())
}


In the example above, when I run the command go run server.go, it starts an HTTP server on port 8080. By visiting the http://localhost:8080 URL in your browser, you will see the message Running HTTP Server!! on the screen.

The srv.ListenAndServe() call uses Go's standard HTTP server configuration. However, you can customize a server using a Server structure type.

To start an HTTPS server, call the srv.ListenAndServeTLS(certFile, keyFile) method with some configuration, just like the srv.ListenAndServe() method. The ListenAndServe and ListenAndServeTLS methods are available on both the HTTP package and the Server structure.

The ListenAndServeTLS method is just like the ListenAndServe method, except it will start an HTTPS server.

func ListenAndServeTLS(certFile string, keyFile string) error

As you can see from the method signature above, the only difference between this method and the ListenAndServe method is the additional certFile and keyFile arguments. These are the paths to the SSL certificate file and private key file, respectively.

Generating a private key and an SSL certificate

Follow these steps to generate a root key and certificate:

1. Create the root key:

openssl genrsa -des3 -out rootCA.key 4096

2. Create and self-sign the root certificate:

openssl req -x509 -new -nodes -key rootCA.key -sha256 -days 1024 -out rootCA.crt

Next, follow these steps to generate a certificate (for each server):

1. Create the certificate key:

 openssl genrsa -out localhost.key 2048

2. Create the certificate-signing request (CSR). The CSR is where you specify the details for the certificate you want to generate. The owner of the root key will process this request to generate the certificate. When creating the CSR, it is important to specify the Common Name providing the IP address or domain name for the service, otherwise the certificate cannot be verified.

 openssl req -new -key localhost.key -out localhost.csr

3. Generate the certificate using the TLS CSR and key along with the CA root key:

  openssl x509 -req -in localhost.csr -CA rootCA.crt -CAkey rootCA.key -CAcreateserial -out localhost.crt -days 500 -sha256

Finally, follow the same steps to generate a key and certificate for each client.
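
For example, to produce the client.key and client.crt files that the curl commands later in this article use (the file names are just illustrative):

$ openssl genrsa -out client.key 2048

$ openssl req -new -key client.key -out client.csr

$ openssl x509 -req -in client.csr -CA rootCA.crt -CAkey rootCA.key -CAcreateserial -out client.crt -days 500 -sha256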

Configuring an HTTPS server

Now that you have both private key and certificate files, you can modify your earlier Go program and use the ListenAndServeTLS method instead.

package main

import(
        "net/http"
        "fmt"
        "log"
)

func main() {

        mux := http.NewServeMux()
        mux.HandleFunc("/", func( res http.ResponseWriter, req *http.Request ) {
                fmt.Fprint( res, "Running HTTPS Server!!" )
        })
       
        srv := &http.Server{
                Addr: fmt.Sprintf(":%d", 8443),
                Handler:           mux,
        }

        // run server on port "8443"
        log.Fatal(srv.ListenAndServeTLS("localhost.crt", "localhost.key"))
}

When you run the above program, it starts an HTTPS server using localhost.crt as the certFile and localhost.key as the keyFile from the current directory containing the program file. By visiting the https://localhost:8443 URL via browser or command-line interface (CLI), you can see the result below:

$ curl https://localhost:8443/ --cacert rootCA.crt --key client.key --cert client.crt

Running HTTPS Server!!

That's it! This is what most people have to do to launch an HTTPS server. Go manages the default behavior and functionality of the TLS communication.

Configuring an HTTPS server to update certificates automatically

While running the HTTPS server above, you passed certFile and keyFile to the ListenAndServeTLS function. However, if certFile and keyFile change because of certificate expiry, the server needs to be restarted for those changes to take effect. To overcome that, you can use the TLSConfig field of the http.Server structure.

The TLSConfig field takes a tls.Config structure, provided by the crypto/tls package, which configures the TLS parameters of the server, including server certificates, among others. All fields of tls.Config are optional, so assigning an empty struct value to the TLSConfig field is little different from assigning a nil value. However, the GetCertificate field can be very useful.

type Config struct {
   GetCertificate func(*ClientHelloInfo) (*Certificate, error)
}

The GetCertificate field of tls.Config returns a certificate based on the given ClientHelloInfo.

I need to implement a GetCertificate closure function that uses the tls.LoadX509KeyPair(certFile, keyFile string) or tls.X509KeyPair(certPEMBlock, keyPEMBlock []byte) function to load the certificates.

Now I'll create a Server struct that uses the TLSConfig field. Here is a sample format of the code:

package main

import (
        "crypto/tls"
        "fmt"
        "log"
        "net/http"
)

func main() {

        mux := http.NewServeMux()
        mux.HandleFunc("/", func( res http.ResponseWriter, req *http.Request ) {
                fmt.Fprint( res, "Running HTTPS Server!!\n" )
        })
       
        srv := &http.Server{
                Addr: fmt.Sprintf(":%d", 8443),
                Handler:           mux,
                TLSConfig: &tls.Config{
                        GetCertificate: func(*tls.ClientHelloInfo) (*tls.Certificate, error) {
                                // Always load the latest localhost.crt and localhost.key.
                                // For example, keep the certificate files in a shared location that is refreshed
                                // whenever certificates are regenerated; this closure picks up whatever is there.
                                cert, err := tls.LoadX509KeyPair("localhost.crt", "localhost.key")
                                if err != nil {
                                        return nil, err
                                }
                                return &cert, nil
                        },
                },
        }

        // run server on port "8443"
        log.Fatal(srv.ListenAndServeTLS("", ""))
}

In the above program, I implemented the GetCertificate closure function, which returns a cert object of type Certificate by using the LoadX509KeyPair function with the certificate and private key files created earlier. The closure also returns an error, so handle it carefully.

Since I am preconfiguring server instance srv with a TLS configuration, I do not need to provide certFile and keyFile argument values to the srv.ListenAndServeTLS function call. When I run this server, it will work just like before. But this time, I have abstracted all the server configuration from the invoker.

$ curl https://localhost:8443/ --cacert rootCA.crt --key client.key --cert client.crt

Running HTTPS Server!!

Final thoughts

A few additional tips by way of closing: to read the cert and key files inside the GetCertificate function, the HTTPS server should run as a goroutine. There is also a StackOverflow thread that provides other ways of configuring this.
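
As one possible approach along those lines (a sketch of my own, not code from this article or that thread), you can cache the parsed key pair and reload it only when the certificate file's modification time changes, so a rotated certificate is picked up without restarting the server. The file names match the earlier examples:

package main

import (
        "crypto/tls"
        "fmt"
        "log"
        "net/http"
        "os"
        "sync"
        "time"
)

// certReloader caches the parsed key pair and reloads it only when the
// certificate file's modification time changes on disk.
type certReloader struct {
        certFile, keyFile string
        mu                sync.Mutex
        cached            *tls.Certificate
        loadedAt          time.Time
}

func (c *certReloader) getCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
        info, err := os.Stat(c.certFile)
        if err != nil {
                return nil, err
        }
        c.mu.Lock()
        defer c.mu.Unlock()
        if c.cached == nil || info.ModTime().After(c.loadedAt) {
                cert, err := tls.LoadX509KeyPair(c.certFile, c.keyFile)
                if err != nil {
                        return nil, err
                }
                c.cached = &cert
                c.loadedAt = time.Now()
        }
        return c.cached, nil
}

func main() {
        reloader := &certReloader{certFile: "localhost.crt", keyFile: "localhost.key"}

        mux := http.NewServeMux()
        mux.HandleFunc("/", func(res http.ResponseWriter, req *http.Request) {
                fmt.Fprint(res, "Running HTTPS Server!!\n")
        })

        srv := &http.Server{
                Addr:      fmt.Sprintf(":%d", 8443),
                Handler:   mux,
                TLSConfig: &tls.Config{GetCertificate: reloader.getCertificate},
        }

        // run server on port "8443"
        log.Fatal(srv.ListenAndServeTLS("", ""))
}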


Use OCI containers to run WebAssembly workloads

Aditya R | Wed, 10/05/2022

WebAssembly (also referred to as Wasm) has gained popularity as a portable binary instruction format with an embeddable and isolated execution environment for client and server applications. Think of WebAssembly as a small, fast, efficient, and very secure stack-based virtual machine designed to execute portable bytecode that doesn't care what CPU or operating system it runs on. WebAssembly was initially designed for web browsers to be a lightweight, fast, safe, and polyglot container for functions, but it's no longer limited to the web.

On the web, WebAssembly uses the existing APIs provided by browsers. The WebAssembly System Interface (WASI) was created to fill the void between WebAssembly and systems running outside the browser. This enables non-browser systems to leverage the portability of WebAssembly, making WASI a good choice for portability when distributing a workload and for isolation when running it.

WebAssembly offers several advantages. Because it is platform neutral, one single binary can be compiled and executed on a variety of operating systems and architectures simultaneously, with a very low disk footprint and startup time. Useful security features include module signing and security knobs controllable at the run-time level rather than depending on the host operating system's user privilege. Sandboxed memory can still be managed by existing container tools infrastructure.

In this article, I will walk through a scenario for configuring container runtimes to run Wasm workloads from lightweight container images.

Adoption on cloud infrastructure and blockers

WebAssembly and WASI are fairly new, so the standards for running Wasm workloads natively on container ecosystems have not been set. This article presents only one solution, but there are other viable methods.

Some of the solutions include switching native Linux container runtimes with components that are Wasm compatible. For instance, Krustlet v1.0.0-alpha1 allows users to introduce Kubernetes nodes where Krustlet is used as a replacement for a standard kubelet. The limitation of this approach is that users have to choose between Linux container runtime and Wasm runtime.

Another solution is using a base image with Wasm runtime and manually invoking compiled binary. However, this method makes container images bloated with runtime, which is not necessarily needed if we invoke Wasm runtime natively at a lower level than container runtime.

I will describe how you can avoid this by creating a hybrid setup where existing Open Containers Initiative (OCI) runtimes can run both native Linux containers and WASI-compatible workloads.

Using crun in a hybrid setup of Wasm and Linux containers

Some of the problems discussed above can be easily addressed by allowing an existing OCI runtime to invoke both Linux containers and Wasm containers at a lower level. This avoids issues like depending on container images to carry Wasm runtime or introducing a new layer to infrastructure that supports only Wasm containers.

One container runtime that can handle the task: crun.

Crun is fast, has a low-memory footprint, and is a fully OCI-compliant container runtime that can be used as a drop-in replacement for your existing container runtime. Crun was originally written to run Linux containers, but it also offers handlers capable of running arbitrary extensions inside the container sandbox in a native manner.

This is an informal way of replacing existing runtime with crun just to showcase that crun is a complete replacement for your existing OCI runtime.

$ mv /path/to/existing-runtime /path/to/existing-runtime.backup
$ cp /path/to/crun /path/to/existing-runtime

One such handler is crun-wasm-handler, which delegates specially configured container images (Wasm compat images) to existing Wasm runtimes, running them natively inside the crun sandbox. This way, end users do not need to maintain Wasm runtimes by themselves.

Crun has native integration with wasmedge, wasmtime, and wasmer to support this functionality out of the box. It dynamically invokes parts of these runtimes as crun detects whether the configured image contains any Wasm/WASI workload, and it does so while still supporting the native Linux containers.

For details on building crun with Wasm/WASI support, see the crun repository on GitHub.

Building and running Wasm images using Buildah on Podman and Kubernetes

Users can create and run platform-agnostic Wasm images on Podman and Kubernetes using crun as an OCI runtime under the hood. Here's a tutorial:

Creating Wasm compat images using Buildah

Wasm/WASI compatible images are special. They contain a magic annotation that helps an OCI runtime like crun classify whether it is a Linux-native image or an image with a Wasm/WASI workload. Then it can invoke handlers if needed.

Creating these Wasm compat images is extremely easy with any container image build tool, but for this article, I will demonstrate using Buildah.

1. Compile your .wasm module.
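
The article doesn't prescribe a toolchain for this step. As one example (an assumption, not the article's method), a default Rust hello-world program can be compiled to a WASI module like this:

$ rustup target add wasm32-wasi
$ cargo new hello && cd hello
$ cargo build --target wasm32-wasi --release
$ cp target/wasm32-wasi/release/hello.wasm ./hello.wasm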

2. Prepare a Containerfile with your .wasm module.

FROM scratch

COPY hello.wasm /

CMD ["/hello.wasm"]

3. Build a Wasm image using Buildah with the annotation module.wasm.image/variant=compat:

$ buildah build --annotation "module.wasm.image/variant=compat" -t mywasm-image

Once the image is built and the container engine is configured to use crun, crun automatically detects the annotation and runs the provided workload with the configured Wasm handler.

Running a WASM workload with Podman

Crun is the default OCI runtime for Podman. Podman contains knobs and handles to utilize most crun features, including the crun Wasm handler. Once a Wasm compat image is built, it can be used by Podman just like any other container image:

$ podman run mywasm-image:latest

Podman runs the requested Wasm compat image mywasm-image:latest using crun's Wasm handler and returns output confirming that our workload was executed.

hello world from the webassembly module !!!!

Kubernetes-supported and tested container run-time interface (CRI) implementations

Here's how to configure two popular container runtimes:

CRI-O
  • Configure CRI-O to use crun instead of runc by editing its config at /etc/crio/crio.conf (see the example snippet after this list). Red Hat OpenShift documentation contains more details about configuring CRI-O.
  • Restart CRI-O with sudo systemctl restart crio.
  • CRI-O automatically propagates pod annotations to the container spec.
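
A minimal sketch of the relevant /etc/crio/crio.conf settings (the binary path is an assumption; adjust it to wherever crun is installed on your system):

[crio.runtime]
default_runtime = "crun"

[crio.runtime.runtimes.crun]
runtime_path = "/usr/bin/crun"
runtime_type = "oci"
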
Containerd
  • Containerd supports switching container runtime via a custom configuration defined at /etc/containerd/config.toml.
  • Configure containerd to use crun by making sure the runtime binary points to crun. More details are available in the containerd documentation.
  • Configure containerd to allowlist Wasm annotations so they can be propagated to the OCI spec by setting pod_annotations in the configuration: pod_annotations = ["module.wasm.image/variant.*"] (see the example snippet after this list).
  • Restart containerd with sudo systemctl restart containerd.
  • Now containerd should propagate Wasm pod annotations to containers.
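
As a sketch (this snippet is an assumption based on containerd's CRI plugin configuration, not taken from the article; the section path can differ between containerd versions, so verify it against your containerd documentation), the per-runtime section of /etc/containerd/config.toml might look like this:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.crun]
  # run crun through the runc v2 shim
  runtime_type = "io.containerd.runc.v2"
  # propagate the Wasm annotation from the pod spec to the OCI spec
  pod_annotations = ["module.wasm.image/variant.*"]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.crun.options]
    BinaryName = "/usr/bin/crun"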

The following is an example of a Kubernetes pod spec that works with both CRI-O and containerd:

apiVersion: v1
kind: Pod
metadata:
  name: pod-with-wasm-workload
  namespace: mynamespace
  annotations:
    module.wasm.image/variant: compat
spec:
  containers:
  - name: wasm-container
    image: myrepo/mywasmimage:latest

Known issues and workarounds

Complex Kubernetes infrastructure contains pods and, in many cases, pods with sidecars. That means crun's Wasm integration is not useful when a deployment contains sidecars and sidecar containers do not contain a Wasm entry point, such as infrastructure setups with service mesh like Linkerd, Gloo, and Istio or a proxy like Envoy.

You can solve this issue by adding two smart annotations for Wasm handlers: compat-smart and wasm-smart. These annotations serve as a smart switch that only toggles Wasm runtime if it's necessary for a container. Hence while running deployments with sidecars, only containers that contain valid Wasm workloads are executed by Wasm handlers. Regular containers are treated as usual and delegated to the native Linux container runtime.

Thus when building images for such a use case, use annotation module.wasm.image/variant=compat-smart instead of module.wasm.image/variant=compat.

You can find other known issues in crun documentation on GitHub.


Learn the OSI model in 5 minutes

Anamika | Tue, 10/04/2022

The Open Systems Interconnection (OSI) model is a standard for how computers, servers, and people communicate within a system. It was the first standard model for network communications and was adopted in the early 1980s by all major computer and telecommunications companies.

The OSI model provides a universal language for describing networks and thinking about them in discrete chunks, or layers.

Layers of the OSI model

The model describes the seven layers through which computer systems communicate over a network, listed here from the top of the stack (layer 7, the application layer) down to the bottom (layer 1, the physical layer):

  1. Application layer
  2. Presentation layer
  3. Session layer
  4. Transport layer
  5. Network layer
  6. Data link layer
  7. Physical layer

Each of these layers has its own way of working, with its own set of protocols that distinguish it from the others. This article provides a breakdown of the layers one by one.

Application layer

The application layer is implemented in software. It is the layer used to interact with applications.

Consider the example of sending a message. The sender will interact with the application layer and send the message. The application layer sends the message to the next layer in the OSI Model, the presentation layer.

Presentation layer

The data from the application layer is forwarded to the presentation layer. The presentation layer receives the data in the form of words, characters, letters, numbers, and so on, and converts them into machine representable binary format. This process is known as translation.

At this stage, data may be translated between character encodings, for example from ASCII (American Standard Code for Information Interchange) to Extended Binary Coded Decimal Interchange Code (EBCDIC) for systems that expect it. Before the converted data goes further, it also undergoes encoding and encryption, for example using the SSL/TLS protocol for encryption and decryption.

The presentation layer provides abstraction and assumes that the layers following it will take care of the data forwarded to them from this layer. It also plays a role in compression of the data. The compression can be lossy or lossless, depending on various factors beyond this article's scope.

Session layer

The session layer helps in setting up and managing connections. The main work of this layer is to establish a session. For example, on an online shopping site, a session is created between your computer and the site's server.

The session layer enables the sending and receiving of data, followed by the termination of connected sessions. Authentication is done before a session is established, followed by authorization. Like the previous layers, the session layer also assumes that, after its work is done, the data will be correctly handled by the subsequent layers.

Transport layer

The transport layer manages data transport and has its own set of protocols for how data is transferred. The data received here from the session layer is divided into smaller data units called segments. This process is known as segmentation. Every segment contains the source's and destination's port numbers and a sequence number. Port numbers identify the application to which the data needs to be sent. Note that the data is transferred in chunks. The sequence numbers are used to reassemble the segments in the correct order.

The transport layer takes care of the flow control, or the amount of data transferred at a given time. It also accounts for error control, such as data loss, data corruption, and so on. It makes use of an error-detecting value known as a checksum. The transport layer adds a checksum to every data segment to check whether the sent data is received correctly. Data is then transferred to the network layer.

Network layer

The network layer helps communicate with other networks. It works to transmit received data segments from one computer to another located in a different network. The router lives in the network layer.

The function of the network layer is logical addressing (IP Addressing). It assigns the sender's and receiver's IP addresses to each data packet to ensure it is received at the correct destination. The network layer then routes the data packets. Load balancing also happens in the network layer to make sure that no overloading takes place. Next, the data is transported to the data link layer.

Data link layer

The data link layer allows direct communication with other devices, such as computers and hosts.

It receives data packets containing the IP addresses of the sender and receiver from the network layer and does the physical addressing, assigning the media access control (MAC) addresses of the sender and receiver to a data packet to form a frame.

Physical layer

This layer consists of all the hardware and mechanical elements of a system, including the configuration of wires, pins, adapters, and so forth. The data received here by the preceding layers is in the form of 0s and 1s. The physical layer converts this data and transports it to local media via various means, including wires, electrical signals, light signals (as in optical fiber cables), and radio signals (as in WiFi).

Note that on the receiver's end, the physical layer converts the received signal back into bits and passes it to the data link layer as a frame. The frame is moved up through the higher layers, and ultimately the required data arrives at the application layer, which is the software.

Conclusion

The OSI model is helpful when you need to describe network architecture or troubleshoot network problems. I hope this article gave you a clearer understanding of the elements of this model.


5 new improvements in Apache ShardingSphere

Yacine Si Tayeb, PhD | Tue, 10/04/2022

Apache ShardingSphere, a powerful distributed database, recently released a major update to optimize and enhance its features, performance, testing, documentation, and examples. In short, the project continues to work hard at development to make it easier for you to manage your organization's data.

1. SQL audit for data sharding

The problem: When a user executes an SQL query without a sharding condition in a large-scale data sharding scenario, the query is routed to all the underlying databases for execution. As a result, many database connections are occupied, and businesses are severely affected by timeouts or other issues. Worse still, should the user perform an UPDATE/DELETE operation, a large amount of data may be incorrectly updated or deleted.

ShardingSphere's solution: As of version 5.2.0, ShardingSphere provides the SQL audit for data sharding feature and allows users to configure audit strategies. The strategy specifies multiple audit algorithms, and users can decide whether audit rules should be disabled. SQL execution is strictly prohibited if any audit algorithm fails to pass.

Here's the configuration of an SQL audit for data sharding:

rules:
- !SHARDING
  tables:
    t_order:
      actualDataNodes: ds_${0..1}.t_order_${0..1}
      tableStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: t_order_inline
      auditStrategy:
        auditorNames:
          - sharding_key_required_auditor
        allowHintDisable: true
  defaultAuditStrategy:
    auditorNames:
      - sharding_key_required_auditor
    allowHintDisable: true
  auditors:
    sharding_key_required_auditor:
      type: DML_SHARDING_CONDITIONS

Given complex business scenarios, this new feature allows you to dynamically disable the audit algorithm by using SQL hints so that partial business SQL operations can be executed.

ShardingSphere has a built-in audit algorithm, DML_SHARDING_CONDITIONS, which blocks DML statements that would be fully routed to every shard. You can also implement the ShardingAuditAlgorithm interface to gain advanced SQL audit functions:

/* ShardingSphere hint: disableAuditNames=sharding_key_required_auditor */ SELECT * FROM t_order;

2. SQL execution process management

MySQL provides a SHOW PROCESSLIST statement, allowing you to view the currently running threads, and a KILL statement to terminate a thread whose SQL is taking too long.

[Image by Duan Zhengqiang, CC BY-SA 4.0]

The SHOW PROCESSLIST and KILL statements are widely used in daily operation and maintenance management. To enhance your ability to manage ShardingSphere, version 5.2.0 supports the MySQL SHOW PROCESSLIST and KILL statements. When you execute a DDL/DML statement through ShardingSphere, ShardingSphere automatically generates a unique UUID identifier and stores the SQL execution information in each instance.

When you execute the SHOW PROCESSLIST statement, ShardingSphere processes the SQL execution information based on the current operating mode.

If the current mode is cluster mode, ShardingSphere collects and synchronizes the SQL execution information of each compute node through the governance center and then returns the summary to the user. If the current mode is the standalone mode, ShardingSphere only returns SQL execution information in the current compute node.

You get to determine whether to execute the KILL statement based on the result returned by SHOW PROCESSLIST, and ShardingSphere cancels the SQL in execution based on the ID in the KILL statement.
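
As a rough sketch, a session from a client connected to ShardingSphere might look like this (the process ID below is a placeholder; use the Id value reported by SHOW PROCESSLIST):

SHOW PROCESSLIST;
-- find the Id of the statement that has been running too long, then:
KILL <process_id>;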

3. Shardingsphere-on-cloud

Shardingsphere-on-cloud is a project of Apache ShardingSphere providing cloud-oriented solutions. Version 0.1.0 has been released, and it has been officially voted as a sub-project of Apache ShardingSphere.

Shardingsphere-on-cloud will continue releasing configuration templates, deployment scripts, and other automation tools for ShardingSphere on the cloud.

It will also polish the engineering practices in terms of high availability, data migration, observability, shadow DB, security, and audit, optimize the delivery mode of Helm Charts, and continue to enhance its cloud-native management capabilities through the Kubernetes Operator. There are already introductory issues in the project repository to help contributors interested in Go, databases, and the cloud get up and running quickly.

4. Access port

In version 5.2.0, ShardingSphere-Proxy can listen on specified IP addresses and integrates the openGauss database driver by default. ShardingSphere-JDBC now supports c3p0 data sources, and Connection.prepareStatement can specify the columns to be returned.

5. Distributed transaction

The original logical database-level transaction manager has been adjusted to a global manager, supporting distributed transactions across multiple logical databases. XA transactions are now automatically managed by ShardingSphere, which removes the XA statement's ability to control distributed transactions.

Use ShardingSphere for distributed data

ShardingSphere supports multiple databases, and each improvement takes more work off your hands. The establishment of the shardingsphere-on-cloud sub-project shows ShardingSphere's commitment to being cloud-native. The greater ShardingSphere community welcomes anyone interested in Go, databases, and the cloud to join the shardingsphere-on-cloud sub-project!

The article was first published on Medium.com and has been republished with permission.

ShardingSphere supports multiple databases, and each improvement in the 5.2.0 release takes more work off your hands.


Databases

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Is your old computer 'obsolete', or is it a Linux opportunity?

Mon, 10/03/2022 - 15:00
Is your old computer 'obsolete', or is it a Linux opportunity? Phil Shapiro Mon, 10/03/2022 - 03:00

You may often hear someone claim a computer, tablet, or smartphone is "obsolete." When you hear such a statement, take a minute to ask yourself, "Is this person speaking an opinion or a fact?"

Many times, their statement is an opinion. Let me explain why.

When someone declares a computer "obsolete," they often speak from their own point of view. So, if you're a technology professional, a five-year-old computer might indeed be obsolete. But is that same five-year-old computer obsolete to a refugee family fleeing war or famine? Probably not. A computer that is obsolete for you might be a dream computer for someone else.

How I refurbish old computers with Linux

I have some experience in this field. For the past 25 years, I've been taking older computers to people who don't have them. One of my second grade students, raised by her grandmother, graduated from Stanford University five years ago. Another one of my students, to whom I delivered a dusty Windows XP desktop in 2007, graduated from Yale University last year. Both of these students used donated computers for their own self-advancement. The latter student typed more than 50 words per minute before reaching middle school. Her family could not afford Internet service when I delivered her donated computer to her in third grade. So, she used her time productively to learn touch typing skills. I document her story in this YouTube video.

I'll share another anecdote that is difficult to believe, even for me. A few years ago, I bought a Dell laptop on eBay for $20. This laptop was a top-of-the-line laptop in 2002. I installed Linux Mint on it, added a USB WiFi adapter, and this laptop was reborn. I documented this story in a YouTube video titled, "My $20 eBay laptop."

In the video, you can see this laptop surfing the web. It's not speedy but is much faster than the dial-up computers we used in the late 1990s. I would describe it as functional. Someone could write their doctoral thesis using this 2002 laptop. The thesis would read as well as if it were written using a computer released yesterday. This laptop should be set up somewhere public where people can see up close that a 2002 computer can still be usable. Seeing is believing. Ain't that the truth?


How about those famed "netbooks" from 2008, 2009, and 2010? Surely those are obsolete, right? Not so fast! If you install a 32-bit Linux on them, they can surf the web just fine using the latest version of the Chromium web browser, which still supports 32-bit operating systems. (Google Chrome no longer supports 32-bit operating systems, though.) A student with one of these netbooks could watch Khan Academy videos and develop their writing skills using Google Docs. Hook up one of these netbooks to a larger LCD screen, and the student could develop skills with LibreOffice Draw or Inkscape, two of my favorite open source graphics programs. If you're interested, I have a video for reviving netbooks using Linux. Netbooks are also ideal for mailing overseas to a school in Liberia, a hospital in Haiti, a food distribution site in Somalia, or anywhere else where donated technology could make a huge difference.

Do you know where refurbished netbooks would really be welcome? In the communities that have opened their hearts and homes to Ukrainian refugees. They're doing their part, and we ought to be doing ours.

Open source revives old computers

Many technology professionals live in a bubble of privilege. When they declare a technology "obsolete," they might not realize the harm they are causing by representing this opinion as a fact. People unfamiliar with how open source can revive older computers are sentencing those computers to death. I won't stand idly by when that happens. And you should not, either.

A simple response to a person who declares a computer obsolete is to say, "Sometimes older computers can be revived and put back to use. I've heard that open source is one way of doing that."

If you know the person well, you might want to share links to some of the YouTube videos listed in this article. And when you get a chance, take a moment to meet an individual or family who lacks access to the technology they need. That meeting will enrich your life in ways you would have never imagined.

Too often older computers are labeled 'obsolete'. Linux changes that. Refurbish an old computer and make it useful again for someone who needs it.


Hardware Linux

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

What's new with Awk?

Sat, 10/01/2022 - 15:00
What's new with Awk? Jim Hall Sat, 10/01/2022 - 03:00

Awk is a powerful scripting tool that makes it easy to process text. Awk scripts use a pattern-action syntax, where Awk performs an action for every line in a file that matches a pattern. This provides a flexible yet powerful scripting language to deal with text. For example, the one-line Awk script /error/ {print $1, $2, $3} will print the first three space-delimited fields for any line that contains the word error.
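
For instance, run from the shell against a hypothetical log file named server.log, that one-liner looks like this:

awk '/error/ {print $1, $2, $3}' server.log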

While we also have the GNU variant of Awk, called Gawk, the original Awk remains under development. Recently, Brian Kernighan started a project to add Unicode support to Awk. I met with Brian to ask about the origins of Awk and his recent development work on Awk.

Jim Hall: Awk is a great tool to parse and process text. How did it start?

Brian Kernighan: The most direct influence was a tool that Marc Rochkind developed while working on the Programmer's Workbench system at Bell Labs. As I remember it now, Marc's program took a list of regular expressions and created a C program that would read an input file. Whenever the program found a match for one of the regular expressions, it printed the matching line. It was designed for creating error checking to run over log files from telephone operations data. It was such a neat idea—Awk is just a generalization.

Jim: AWK stands for the three of you who created it: Al Aho, Peter Weinberger, and Brian Kernighan. How did the three of you design and create Awk?

Brian: Al was interested in regular expressions and had recently implemented egrep, which provided a very efficient lazy-evaluation technique for a much bigger class of regular expressions than what grep provided. That gave us a syntax and working code.

Peter had been interested in databases, and as part of that he had some interest in report generation, like the RPG language that IBM provided. And I had been trying to figure out some kind of editing system that made it possible to handle strings and numbers with more or less equal ease.

We explored designs, but not for a long time. I think Al may have provided the basic pattern-action paradigm, but that was implicit in a variety of existing tools, like grep, the stream editor sed, and in the language tools YACC and Lex that we used for implementation. Naturally, the action language had to be C-like.

Jim: How was Awk first used at Bell Labs? When was Awk first adopted into Unix?

Brian: Awk was created in 1977, so it was part of 7th-edition Unix, which I think appeared in about 1979. I wouldn't say it was adopted, so much as it was just another program included because it was there. People picked it up very quickly, and we soon had users all over the Labs. People wrote much bigger programs than we had ever anticipated, too, even tens of thousands of lines, which was amazing. But for some kinds of applications, the language was a good match.

Jim: Has Awk changed over the years, or is Awk today more or less the same Awk from 1977?

Brian: Overall, it's been pretty stable, but there have been a fair number of small things, mostly to keep up with at least the core parts of Gawk. Examples include things like functions to do case conversion, shorthands for some kinds of regular expressions, or special filenames like /dev/stderr. Internally, there's been a lot of work to replace fixed-size arrays with arrays that grow. Arnold Robbins, who maintains Gawk, has also been incredibly helpful with Awk, providing good advice, testing, code, and help with Git.

Jim: You're currently adding Unicode support to Awk. This is one of those projects that seems obvious when you hear it, because Unicode is everywhere, but not every program supports it yet. Tell us about your project to add Unicode to Awk.

Brian: It's been sort of embarrassing for a while now that Awk only handled 8-bit input, though in fairness it predates Unicode by 10 or 20 years. Gawk, the GNU version, has handled Unicode properly for quite a while, so it's good to be up to date and compatible.


Jim: How big of a project is adding Unicode support? Did this require many changes to the source code?

Brian: I haven't counted, but it's probably 200 or 300 lines, primarily concentrated in either the regular expression recognizer or in the various built-in functions that have to operate in characters, not bytes, for Unicode input.

Jim: How far along are you in adding Unicode to Awk?

Brian: There's a branch of the code at GitHub that's pretty up to date. It's been tested, but there's always room for more testing.

One thing to mention: It handles UTF-8 input and output, but for Unicode code points, which are not the same thing as Unicode graphemes. This distinction is important but technically very complicated, at least as I understand it. As a simple example, a letter with an accent could be represented as two code points (letter and accent) or as a single character (grapheme). Doing this right, whatever that means, is very hard.

Jim: In a Computerphile video, you mention adding support for comma-separated values (CSV) parsing to Awk. How is that project going?

Brian: While I had my hands in the code again, I did add support for CSV input, since that's another bit of the language that was always clunky. I haven't done anything for CSV output, since that's easy to do with a couple of short functions, but maybe that should be revisited.

Jim: What kinds of things do you use Awk for in your day-to-day work?

Brian: Everything. Pretty much anything that fiddles text is a target for Awk. Certainly, the Awk program I use most is a simple one to make all lines in a text document the same length. I probably used it 100 times while writing answers to your questions.

Jim: What's the coolest (or most unusual) thing you have used Awk to do?

Brian: A long time ago, I wrote a C++ program that converted Awk programs into C++ that looked as close to Awk as I could manage, by doing things like overloading brackets for associative arrays. It was never used, but it was a fun exercise.

Further reading

Brian Kernighan discusses the scripting tool Awk, from its creation to current work on Unicode support.

Image by: Jonas Leupe on Unsplash

Programming

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

How I dock my Linux laptop

Fri, 09/30/2022 - 15:00
How I dock my Linux laptop Don Watkins Fri, 09/30/2022 - 03:00

Not that long ago, docking a laptop was a new idea to me. Since then, I've set up two different laptops for docking, and I've been more than satisfied with how this configuration works. This article describes how I set up the laptop docking stations and gives tips and guidance for anyone considering the same.

How I discovered docking

I began 2020 with one laptop: a Darter Pro purchased from System76 the year before. It had been more than adequate, and I enjoyed the freedom of movement that toting a laptop gave me. Then, of course, everything changed.

Suddenly, I was at home all the time. Instead of an occasional video conference, remote meetings and get-togethers became the norm. I grew tired of the little square images of colleagues on the Darter Pro's 15.6-inch display, and the included 256 GB Non-Volatile Memory express (NVMe) drive was a limited amount of storage for my needs.

I purchased an Intel NUC (Next Unit of Computing) kit and a 27-inch LCD display, rearranged my office with a table, and settled into this new paradigm. Later in the year, I built another, more powerful NUC with an i7 processor, 32 GB of RAM, and a terabyte NVMe drive to accommodate video games and the video converted from old family 8mm movies and VHS tapes.

Now I had a laptop and a desktop (both running Linux distributions, of course). I regularly met with colleagues and friends on a variety of video conferencing platforms, and I thoroughly enjoyed the increased desktop real estate afforded by the 27-inch display.

Fast-forward 18 months to summer 2022, when I discovered laptop docking stations.

Docking a Linux laptop

I didn't realize I could have docked my Darter Pro using its USB-C port. My appetite for new hardware had me pining for the new HP Dev One, which comes with Pop!_OS preinstalled, a Ryzen 7 processor, 16 GB of RAM, a terabyte NVMe drive, and two USB-C ports.

Opensource.com alumnus Jay LaCroix has an excellent video explaining how to use USB-C docking with Linux laptops. That's when I began to appreciate the utility of USB-C docking hardware. After watching the video and doing some additional research, I decided to purchase a USB-C docking station for the Darter Pro.

System76 has a list of community-recommended docks, including the Plugable UD-CA1A dock. The Plugable dock claims to work with all Intel and NVIDIA systems, so I decided to order a unit. The dock was easy to set up, and once connected to the Darter Pro, I had the best of both worlds. I had the freedom to move when I wanted to and the ability to dock if I wanted to. But I needed more storage space than the 256GB drive in the Darter Pro.

My success with the docking station led me to purchase the HP Dev ONE, eager for the increased storage and speed. When it arrived, I quickly set it up. I backed up all the files on the NUC and restored them on my new Dev ONE. The new laptop was easily connected to the docking station.

Docking the Dev ONE

The Dev ONE has two USB-C ports, and either one connects to the dock. It still amazes me that power and video pass through this single port. In addition to the two USB-A ports on my laptop, I now have the additional ports on the docking station. The Plugable dock comes with a headphone jack, microphone jack, 4K high-definition multimedia interface (HDMI) output, one USB-C port on the rear, three USB-A 3.0 ports in the front, two USB-A 2.0 ports on the back, and a Gigabit Ethernet port. My new laptop has a webcam, but I plug an external webcam into one of the rear USB ports to accommodate video conferencing when I'm docked.

I enjoy the flexibility and power that docking has added to my workflow. I use video conferencing daily, and when I do, I'm docked and connected to a 27-inch display. I love how easy it is to transition from one setup to another. If you're using a laptop with a USB-C port, I recommend looking into a good docking station. It's well worth it.

Or are you already using a dock with your Linux laptop? What are your USB-C docking station recommendations? Be sure to share your experience in the comments below.

Docking a Linux laptop offers flexibility and a better video experience.


Linux Hardware

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

How Podman packaging works on Linux

Thu, 09/29/2022 - 15:00
How Podman packaging works on Linux Lokesh Mandvekar Thu, 09/29/2022 - 03:00

Over the past few months, the Podman project has been reworking its process for generating Debian and Ubuntu packages. This article outlines the past and present of the Debian packaging work done by the Podman project team. Please note that this article does not refer to the official Debian and Ubuntu packages that Reinhard Tartler and team created and maintain.

Debian build process

Long story short, the typical Debian build process involves "Debianizing" an upstream repository. First, a debian subdirectory containing packaging metadata and any necessary patches is added to the upstream repo. Then the dpkg-buildpackage command is run to generate the .deb packages.

Older Debian build process for Podman

Previously, the Debian packages for Podman were generated using this "Debianization" process. A debian directory containing the packaging metadata was added to the Podman source in a separate fork. That fork got rebased for each new upstream Podman release.

Issues with the Debian build process (for an RPM packager)

While a simple rebase would often work, that was not always the case. Usually, the Podman source itself would require patching to make things work for multiple Debian and Ubuntu versions, leading to rebase failures. And failures in a rebase meant failures in automated tasks, which, in turn, caused a lot of frustration.

This same frustration led our team to retire the Debian packages in the past. When Podman v3.4 officially made its way into Debian 11 and Ubuntu 22.04 LTS (thanks to the amazing Reinhard Tartler), we thought the Podman project could say goodbye to Debian package maintenance.

But that wasn't meant to be. Both Debian and Ubuntu are rather conservative in their package update policies, especially in their release and LTS versions. As a result, many Podman users on Debian-based distributions would be stuck with v3.4 for quite a while, perhaps the entire lifetime of the distro version. While users can often install the latest packages from Debian's experimental repository, that's not necessarily convenient for everyone. As a result, many Debian-based users asked the Podman project for newer packages.

If we were to resurrect the Podman project's own Debian packages, we needed the packaging format to be easy to maintain and debug for RPM packagers and also easy to automate, which meant no frequent failures with rebases and patches.

OBS + Debbuild

The debbuild tool, created by Neal Gompa and others, is a set of RPM packaging macros allowing packagers to build Debian packages using Fedora's packaging sources. Conveniently, debbuild packages can easily be added as dependencies to a project hosted on openSUSE's Open Build Service infrastructure.

Here's a snippet of how debbuild support is enabled for Ubuntu 22.04 on the OBS Stable Kubic repository, maintained by the Podman project:

<!-- repository definition (simplified); only the enabled build architectures are shown -->
<arch>x86_64</arch>
<arch>s390x</arch>
<arch>armv7l</arch>
<arch>aarch64</arch>

The complete configuration file is available here.

In addition to enabling debbuild packages as dependencies, the Fedora packaging sources must be updated with rules to modify the build process for Debian and Ubuntu environments.

Here's a snippet of how it's done for Podman:

%if "%{_vendor}" == "debbuild"
Packager: Podman Debbuild Maintainers
License: ASL-2.0+ and BSD and ISC and MIT and MPLv2.0
Release: 0%{?dist}
%else
License: ASL 2.0 and BSD and ISC and MIT and MPLv2.0
Release: %autorelease
ExclusiveArch: %{golang_arches}
%endif

The "%{_vendor}" == "debbuild" conditional is used in many other places throughout the spec file. For example, in this code sample, it specifies different sets of dependencies and build steps for Fedora and Debian. Also, debbuild allows conditionalizing on Debian and Ubuntu versions using the %{debian} and %{ubuntu} macros, which are familiar to RPM packagers.

You can find the updated RPM spec file with all the debbuild changes here.

These two pieces together produce successful Debian package builds on the OBS Unstable Kubic repository.

Using debbuild also ensures that packaging metadata lives in its own separate repository, implying no patching or rebasing hassles with upstream Podman sources.

Usability

At this time, packages are available for Ubuntu 22.04, Debian Testing, and Debian Unstable. We're in talks with the OBS infrastructure maintainers to adjust the Debian 11 and Ubuntu 20.04 build environments, after which we'll also have successful builds for those two environments.

$ apt list podman
Listing... Done
podman/unknown,now 4:4.2.0-0ubuntu22.04+obs55.1 amd64 [installed]
podman/unknown 4:4.2.0-0ubuntu22.04+obs55.1 arm64
podman/unknown 4:4.2.0-0ubuntu22.04+obs54.1 armhf
podman/unknown 4:4.2.0-0ubuntu22.04+obs54.1 s390x

Now, let's talk usability. These packages have been manually verified, and the Podman team has found them to satisfy typical use cases. Users can install these packages as they would any other DEB package. The repository first needs to be enabled, and there are instructions on the Podman website. However, these packages are not Debian-approved. They haven't gone through the same quality assurance process as official Debian packages. These packages are currently not recommended for production use, and we urge you to exercise caution before proceeding with installation.
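
As a rough sketch, once the repository has been enabled by following those instructions, installing and checking the package looks like any other apt workflow (the package name podman is assumed here):

# after enabling the Kubic repository per the instructions on the Podman website
sudo apt-get update
sudo apt-get -y install podman
podman version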

Wrap up

The debbuild project allows the Podman project team to generate Debian packages with a few additions to the Fedora packaging sources, making Debian packaging easier to maintain, debug, and automate. It also allows Debian and Ubuntu users to get the latest Podman at the same speed as Fedora users. Podman users on Debian and Ubuntu looking for the latest updates can use our Kubic unstable repository (ideally not on production environments just yet.)

We also highly recommend the debbuild and OBS setup to RPM packagers who must provide Debian and Ubuntu packages to their users. It's a diverse selection of tooling, but open source is all about working together.

Get a deep dive into Podman packages for Debian and Ubuntu using Fedora Sources, OBS, and Debbuild.


Linux Kubernetes

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Build an open source project using this essential advice

Wed, 09/28/2022 - 15:00
Build an open source project using this essential advice Bolaji Ayodeji Wed, 09/28/2022 - 03:00

Open source is a flourishing and beneficial ecosystem that publicly solves problems in communities and industries using software developed through a decentralized model and community contributions. Over the years, this ecosystem has grown in number and strength among hobbyists and professionals alike. It's mainstream now—even proprietary companies use open source to build software.

With the ecosystem booming, many developers want to get in and build new open source projects. The question is: How do you achieve that successfully?

This article will demystify the lifecycle and structure of open source projects. I want to give you an overview of what goes on inside an open source project and show you how to build a successful and sustainable project based on my personal experience.

A quick introduction to open source

The Open Source Initiative (OSI) provides a formal, detailed definition of open source, but Wikipedia provides a nice summary:

Open source software is computer software that is released under a license in which the copyright holder grants users the rights to use, study, change, and distribute the software and its source code to anyone and for any purpose.

Open source software is public code, usually on the internet, developed either collaboratively by multiple people or by one person. It's about collaborating with people from different regions, cultures, and technical backgrounds, often working remotely. This is why creating a project that welcomes everyone and enables different people to work together is essential.

The anatomy of an open source project

Like the human body, an open source project is made up of several structures that form the entire system. I think of them as two branches: the people (microscopic) and the documents (macroscopic).

Branch one: people

Generally, an open source project includes the following sets of people:

  • Creators: Those who created the project
  • Maintainers: Those who actively manage the entire project
  • Contributors: Those who contribute to the project (someone like you!)
  • Users: Those who use the project, including developers and nontechnical customers
  • Working group: A collection of contributors split into domain-specific groups to focus on a discussion or activity around a specific subject area (such as documentation, onboarding, testing, DevOps, code reviews, performance, research, and so on)
  • Sponsors: Those who contribute financial support to the project

You need to consider each group in the list above as you prepare to build a new project. What plan do you have for each of them?

  • For maintainers, decide on the criteria you want to use to appoint them. Usually, an active contributor makes the best maintainer.
  • For users and contributors, you want to prepare solid documentation, an onboarding process, and everything else they need to succeed when working with your project.
  • For working groups, decide whether you need them and how your project may be logically split in the future.
  • Finally, for sponsors, you must provide enough data and information about your project to enable them to choose to sponsor you.

You don't need to have all of these figured out at the start of your project. However, it's wise to think about them at the early stages so you can build the right foundations to ensure that future additions stand firm and lead to a successful project.

Branch two: documents

Open source projects usually include the following documents, usually in plain text or markdown format:

  • License: This legal document explains how and to what extent the project can be freely used, modified, and shared. A list of OSI-approved licenses is available on the OSI website. Without an explicit license, your project is not legally open source!
     
  • Code of conduct: This document outlines the rules, norms, acceptable practices, and responsibilities of anyone who decides to participate in the project in any way (including what happens when someone violates any of the rules). The Contributor Covenant is a good example and is open source (licensed under a Creative Commons license).
     
  • README: This file introduces your project to newcomers. On many Git hosting websites, such as GitLab, GitHub, and Codeberg, the README file is displayed under the initial file listing of a repository. It's common to feature documentation here, with links to other necessary documents.
     
  • Documentation: This is a file or directory containing all documentation resources for the project, including guides, API references, tutorials, and so on.
     
  • Contributing: Include a document explaining how to contribute to the project, including installation guides, configuration, and so on.
     
  • Security: Include a file explaining how to submit vulnerability reports or security issues.

Additionally, a project usually has web pages for issues, support, and collaboration.

Broadly, these include:

  • Issues or bug reports: A place where users can report bugs. This page also provides a place developers can go to assign themselves the task of fixing one or more of them.
     
  • Pull or merge requests: A place with proposed feature enhancements and solutions to bugs. These patches may be created by anyone, reviewed by the maintainers, then merged into the project's code.
     
  • Discussions: A place where maintainers, contributors, and users discuss an open source project. This may be a dedicated website or a forum within a collaborative coding site.

Most projects also have a communication channel in the form of an online chat for conversations and interactions between community members.

Licensing

Licensing is perhaps the easiest but most important criterion to consider before creating an open source project. A license defines the terms and conditions that allow the source code and other components of your project to be used, modified, and shared.

Licenses contain tons of legal jargon that many people don't fully understand. I use choosealicense.com, which helps you choose a license based on your intended community, your desire to get patches back from those using your code, or your willingness to allow people to use your code without sharing improvements they make to it.

Image by: (Bolaji Ayodeji, CC BY-SA 4.0)

13 phases of creating an open source project

Now for the essential question: How do you start an open source software project?

Here is a list of what I consider the phases of an open source project.

  1. Brainstorm your idea, write a synopsis, and document it properly.
     
  2. Begin developing your idea. This usually involves figuring out the right tools and stacks to use, writing some code, version controlling the code, debugging, drinking some coffee, hanging around StackOverflow, using other open source projects, sleeping, and building something to solve a defined problem—or just for fun!
     
  3. Test the project locally, write some unit and integration tests as required, set up CI/CD pipelines as needed, create a staging branch (a test branch where you test the code live before merging into the main branch), and do anything else you need to deploy the project.
     
  4. Write good and effective documentation. This should cover what your project does, why it is useful, how to get started with it (usage, installation, configuration, contributing), and where people can get support.
     
  5. Be sure to document all code conventions you want to use. Enforce them with tools like linters, code formatters, Git hooks, and the commitizen command line utility.
     
  6. Choose the right license and create a README.
     
  7. Publish the project on the internet (you might have a private repository initially, and make it public at this step).
     
  8. Set up the processes for making releases and documenting changelogs (you can use tools like Changesets).
     
  9. Market the project to the world! You can make a post on social media, start a newsletter, share it with your friends privately, do a product hunt launch, live stream, or any other traditional marketing strategy you know.
     
  10. Seek funding support by using any of the available funding platforms, like Open Collective, GitHub Sponsors, Patreon, Buy me a Coffee, LiberaPay, and so on. When you create accounts with these platforms, add a link to it in your project's documentation and website.
     
  11. Build a community around your project.
     
  12. Consider introducing working groups to break your project's management into logical parts when required.
     
  13. Continuously implement new ideas that sustain the resources and people behind your project.

It's important to measure different parts of your project as you progress. This provides you with data you can use for evaluation and future growth strategies.

Now start a project!

I hope this article helps you move forward with that project you've been thinking about.

Feel free to use it as a guide and fill any gaps I missed as you build your awesome open source software project.

Use these steps for a solid foundation for your first—or next—project.


Community management

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

165+ JavaScript terms you need to know

Tue, 09/27/2022 - 15:00
165+ JavaScript terms you need to know Sachin Samal Tue, 09/27/2022 - 03:00

JavaScript is a rich language, with sometimes a seemingly overwhelming number of libraries and frameworks. With so many options available, it's sometimes useful to just look at the language itself and keep in mind its core components. This glossary covers the core JavaScript language, syntax, and functions.

JavaScript variables

var: The most commonly used variable declaration. It can be reassigned but is only accessible within the function where it is defined, meaning function scope. Declarations made with var are hoisted (moved to the top of their scope) when code is executed.

const: Cannot be reassigned and is not accessible before it appears within the code, meaning block scope.

let: Similar to const in that it has block scope; however, a let variable can be reassigned but not re-declared.

Data types

Numbers: var age = 33

Variables: var a

Text (strings): var a = "Sachin"

Operations: var b = 4 + 5 + 6

True or false statements: var a = true

Constant numbers: const PI = 3.14

Objects: var fullName = {firstName:"Sachin", lastName: "Samal"}

Objects

This is a simple example of objects in JavaScript. This object describes a car: the keys, or properties, such as make, model, and year are the object's property names, and each property has a value, such as Nissan, Altima, and 2022. A JavaScript object is a collection of properties with values, and its properties can also hold functions, which are then called methods.

var car = {
make:"Nissan",
model:"Altima",
year:2022,
};

Comparison operators

==: Is equal to

===: Is equal value and equal type

!=: Is not equal

!==: Is not equal value or not equal type

>: Is greater than

<: Is less than

>=: Is greater than or equal to

<=: Is less than or equal to

?: Ternary operator

Logical operators

&&: Logical AND

||: Logical OR

!: Logical NOT
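
Here is a short, hypothetical example that combines comparison and logical operators with the ternary operator:

var age = 20;
var hasTicket = true;

age === "20";                                   // false (different types)
age == "20";                                    // true (values compared after type coercion)
var admitted = (age >= 18 && hasTicket) ? "enter" : "denied";   // "enter"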

Output data

alert(): Output data in an alert box in the browser window

confirm(): Open up a yes/no dialog and return true/false depending on user click

console.log(): Write information to the browser console. Good for debugging.

document.write(): Write directly to the HTML document

prompt(): Create a dialog for user input

Array methods

Array: An object that can hold multiple values at once.

concat(): Join several arrays into one

indexOf(): Return the first position at which a given element appears in an array

join(): Combine elements of an array into a single string and return the string

lastIndexOf(): Give the last position at which a given element appears in an array

pop(): Remove the last element of an array

push(): Add a new element at the end

reverse(): Reverse the order of the elements in an array

shift(): Remove the first element of an array

slice(): Pull a copy of a portion of an array into a new array

splice(): Add or remove elements at a specified position in an array

toString(): Convert elements to strings

unshift(): Add a new element to the beginning

valueOf(): Return the array itself (the primitive value of the array object)
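
A brief sketch showing a few of these methods on a hypothetical array:

var fruits = ["apple", "banana", "cherry"];

fruits.push("date");          // ["apple", "banana", "cherry", "date"]
fruits.indexOf("banana");     // 1
fruits.slice(1, 3);           // ["banana", "cherry"] (the original array is unchanged)
fruits.join(", ");            // "apple, banana, cherry, date"
fruits.reverse();             // ["date", "cherry", "banana", "apple"]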

JavaScript loops

Loops: Perform specific tasks repeatedly under applied conditions.

for (before loop; condition for loop; execute after loop) {
// what to do during the loop
}

for: Creates a conditional loop

while: Repeat a block of code as long as the specified condition evaluates to true; the condition is checked before each iteration, so the loop may not execute at all

do while: Similar to the while loop, it executes at least once and performs a check at the end to see if the condition is met. If it is, then it executes again

break: Stop and exit the cycle at certain conditions

continue: Skip parts of the cycle if certain conditions are met
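
A minimal example that uses for, continue, and break together:

for (var i = 0; i < 10; i++) {
  if (i % 2 === 0) {
    continue;                 // skip even numbers
  }
  if (i > 7) {
    break;                    // stop the loop entirely
  }
  console.log(i);             // prints 1, 3, 5, 7
}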

if-else statements

An if statement executes the code within brackets as long as the condition in parentheses is true. Failing that, an optional else statement is executed instead.

if (condition) {
// do this if condition is met
} else {
// do this if condition is not met
}

Strings

String methods

charAt(): Return a character at a specified position inside a string

charCodeAt(): Give the Unicode of the character at that position

concat(): Concatenate (join) two or more strings into one

fromCharCode(): Return a string created from the specified sequence of UTF-16 code units

indexOf(): Provide the position of the first occurrence of a specified text within a string

lastIndexOf(): Same as indexOf() but with the last occurrence, searching backwards

match(): Retrieve the matches of a string against a search pattern

replace(): Find and replace specified text in a string

search(): Execute a search for a matching text and return its position

slice(): Extract a section of a string and return it as a new string

split(): Split a string object into an array of strings at a specified position

substr(): Extract a part of a string, starting at a specified position and continuing for a specified number of characters; similar to slice()

substring(): Similar to slice(), but it can't accept negative indices

toLowerCase(): Convert strings to lower case

toUpperCase(): Convert strings to upper case

valueOf(): Return the primitive value (that has no properties or methods) of a string object
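
A few of these string methods in action on a hypothetical value:

var greeting = "Hello, Opensource.com";

greeting.toUpperCase();            // "HELLO, OPENSOURCE.COM"
greeting.indexOf("Open");          // 7
greeting.slice(7);                 // "Opensource.com"
greeting.replace("Hello", "Hi");   // "Hi, Opensource.com"
greeting.split(", ");              // ["Hello", "Opensource.com"]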

Number methods

toExponential(): Return a string with a rounded number written as exponential notation

toFixed(): Return the string of a number with a specified number of decimals

toPrecision(): String of a number written with a specified length

toString(): Return a number as a string

valueOf(): Return a number as a number

Math methods

abs(a): Return the absolute (positive) value of a

acos(x): Arccosine of x, in radians

asin(x): Arcsine of x, in radians

atan(x): Arctangent of x as a numeric value

atan2(y,x): Arctangent of the quotient of its arguments

ceil(a): Value of a rounded up to its nearest integer

cos(a): Cosine of a (a is in radians)

exp(a): Value of E raised to the power of a (E^a)

floor(a): Value of a rounded down to its nearest integer

log(a): Natural logarithm (base E) of a

max(a,b,c…,z): Return the number with the highest value

min(a,b,c…,z): Return the number with the lowest value

pow(a,b): a to the power of b

random(): Return a random number between 0 and 1

round(a): Value of a rounded to its nearest integer

sin(a): Sine of a (a is in radians)

sqrt(a): Square root of a

tan(a): Tangent of a (a is in radians)
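
These are all methods of the built-in Math object, so they are called with the Math. prefix:

Math.round(4.6);    // 5
Math.max(3, 9, 27); // 27
Math.sqrt(81);      // 9
Math.random();      // a number greater than or equal to 0 and less than 1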

Dealing with dates in JavaScript

Set dates

Date(): Create a new date object with the current date and time

Date(2022, 6, 22, 4, 22, 11, 0): Create a custom date object. The numbers represent year, month, day, hour, minutes, seconds, milliseconds. You can omit anything except for year and month.

Date("2022-07-29"): Date declaration as a string

Pull date and time values

getDate(): Day of the month as a number (1-31)

getDay(): Weekday as a number (0-6)

getFullYear(): Year as a four-digit number (yyyy)

getHours(): Hour (0-23)

getMilliseconds(): Millisecond (0-999)

getMinutes(): Minute (0-59)

getMonth(): Month as a number (0-11)

getSeconds(): Second (0-59)

getTime(): Milliseconds since January 1, 1970

getUTCDate(): Day (date) of the month in the specified date according to universal time (also available for day, month, full year, hours, minutes, etc.)

parse: Parse a string representation of a date and return the number of milliseconds since January 1, 1970

Set part of a date

setDate(): Set the day as a number (1-31)

setFullYear(): Set the year (optionally month and day)

setHours(): Set the hour (0-23)

setMilliseconds(): Set milliseconds (0-999)

setMinutes(): Set the minutes (0-59)

setMonth(): Set the month (0-11)

setSeconds(): Set the seconds (0-59)

setTime(): Set the time (milliseconds since January 1, 1970)

setUTCDate(): Set the day of the month for a specified date according to universal time (also available for day, month, full year, hours, minutes, etc.)
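
A short example combining the date constructors, getters, and setters above (the date chosen is arbitrary):

var release = new Date(2022, 6, 22);   // 22 July 2022 (months run 0-11)

release.getFullYear();                 // 2022
release.getMonth();                    // 6
release.getTime();                     // milliseconds since January 1, 1970

release.setFullYear(2023);
release.setMonth(0);                   // January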

DOM mode

Node methods

appendChild(): Add a new child node to an element as the last child node

cloneNode(): Clone an HTML element

compareDocumentPosition(): Compare the document position of two elements

getFeature(): Return an object which implements the APIs of a specified feature

hasAttributes(): Return true if an element has any attributes, otherwise false

hasChildNodes(): Return true if an element has any child nodes, otherwise false

insertBefore(): Insert a new child node before a specified, existing child node

isDefaultNamespace(): Return true if a specified namespaceURI is the default, otherwise false

isEqualNode(): Check if two elements are equal

isSameNode(): Check if two elements are the same node

isSupported(): Return true if a specified feature is supported on the element

lookupNamespaceURI(): Return the namespaceURI associated with a given node

normalize(): Join adjacent text nodes and removes empty text nodes in an element

removeChild(): Remove a child node from an element

replaceChild(): Replace a child node in an element

Element methods

getAttribute(): Return the specified attribute value of an element node

getAttributeNS(): Return string value of the attribute with the specified namespace and name

getAttributeNode(): Get the specified attribute node

getAttributeNodeNS(): Return the attribute node for the attribute with the given namespace and name

getElementsByTagName(): Provide a collection of all child elements with the specified tag name

getElementsByTagNameNS(): Return a live HTMLCollection of elements with a certain tag name belonging to the given namespace

hasAttribute(): Return true if an element has the specified attribute, otherwise false

hasAttributeNS(): Provide a true/false value indicating whether the current element in a given namespace has the specified attribute

removeAttribute(): Remove a specified attribute from an element

lookupPrefix(): Return a DOMString containing the prefix for a given namespaceURI, if present

removeAttributeNS(): Remove the specified attribute from an element within a certain namespace

removeAttributeNode(): Take away a specified attribute node and return the removed node

setAttribute(): Set or change the specified attribute to a specified value

setAttributeNS(): Add a new attribute or changes the value of an attribute with the given namespace and name

setAttributeNode(): Set or change the specified attribute node

setAttributeNodeNS(): Add a new namespaced attribute node to an element
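
A minimal sketch using a few of these node and element methods; the element ID is hypothetical and assumes a matching element exists in the page:

var list = document.getElementById("todo-list");   // assumes <ul id="todo-list"> is in the page
var item = document.createElement("li");

item.setAttribute("class", "todo-item");
item.appendChild(document.createTextNode("Write an article"));
list.appendChild(item);

item.getAttribute("class");   // "todo-item"
item.hasChildNodes();         // true
list.removeChild(item);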

JavaScript events

Mouse

onclick: User clicks on an element

oncontextmenu: User right-clicks on an element to open a context menu

ondblclick: User double-clicks on an element

onmousedown: User presses a mouse button over an element

onmouseenter: Pointer moves onto an element

onmouseleave: Pointer moves out of an element

onmousemove: Pointer moves while it is over an element

onmouseover: Pointer moves onto an element or one of its children

setInterval(): Call a function or evaluate an expression at specified intervals (in milliseconds)

oninput: User input on an element

onmouseup: User releases a mouse button while over an element

onmouseout: User moves the mouse pointer out of an element or one of its children

onerror: Happens when an error occurs while loading an external file

onloadeddata: Media data is loaded

onloadedmetadata: Metadata (like dimensions and duration) is loaded

onloadstart: Browser starts looking for specified media

onpause: Media is paused either by the user or automatically

onplay: Media is started or is no longer paused

onplaying: Media is playing after having been paused or stopped for buffering

onprogress: Browser is in the process of downloading the media

onratechange: Media play speed changes

onseeked: User finishes moving/skipping to a new position in the media

onseeking: User starts moving/skipping

onstalled: Browser tries to load the media, but it is not available

onsuspend: Browser is intentionally not loading media

ontimeupdate: Play position has changed (e.g., because of fast forward)

onvolumechange: Media volume has changed (including mute)

onwaiting: Media paused but expected to resume (for example, buffering)
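
A small example wiring up two of these mouse events to a hypothetical button element:

var button = document.getElementById("play-button");   // hypothetical element

button.onclick = function () {
  console.log("Button clicked");
};

button.ondblclick = function () {
  console.log("Button double-clicked");
};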

Keep this JavaScript glossary bookmarked to reference variables, methods, strings, and more.

Image by: Jen Wike Huger

JavaScript

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Get change alerts from any website with this open source tool

Tue, 09/27/2022 - 15:00
Get change alerts from any website with this open source tool Leigh Morresi Tue, 09/27/2022 - 03:00

The year was 2020, and news about COVID-19 came flooding in so quickly that everyone felt completely overwhelmed with similar news articles providing updates with varying degrees of accuracy.

But all I needed to know was when my official government guidelines changed. In the end, that's all that mattered to me.

Whether the concern is a pandemic or just the latest tech news, keeping ahead of changes in website content can be critical.

The changedetection.io project provides a simple yet highly capable, open source solution for website change detection and notification. It's easy to set up, and it can notify over 70 (and counting) different notification systems, such as Matrix, Mattermost, Nextcloud, Signal, Zulip, Home Assistant, email, and more. It also notifies proprietary applications like Discord, Office365, Reddit, Telegram, and many others.

But changedetection.io isn't just limited to watching web page content. You can also monitor XML and JSON feeds, and it will build an RSS feed of the websites that changed.


Thanks to its built-in JSON simple storage system, there's no need to set up complicated databases to receive and store information. You can run it as a Docker image or install it with pip. The project has an extensive wiki help section, and most common questions are covered there.
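
As a quick sketch, either approach can get you a local instance on port 5000; the image name, package name, and flags here are taken from the project's documentation and may change, so treat them as assumptions:

# run the Docker image
docker run -d -p 5000:5000 -v "$(pwd)/datastore:/datastore" dgtlmoon/changedetection.io

# or install with pip and point it at a local datastore
pip install changedetection.io
changedetection.io -d ./datastore -p 5000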

For sites using complex JavaScript, you can connect your changedetection.io installation to a Chromium or Chrome browser with the built-in Playwright content fetcher.

Once running, access the application in your browser (http://localhost:5000, by default). You can set a password in the Settings section if your computer can be reached from an outside network.

Image by: (Leigh Morresi, CC BY-SA 4.0)

Submit the URL of a page you want to monitor. There are several settings related to how the page is filtered. For example, you more than likely do not want to know when a company's stock price listed in their site footer has changed, but you may want to know when they post a news article to their blog.

Monitor a site

Imagine you want to add your favorite website, Opensource.com, to be monitored. You only want to know when the main call-out article contains the word "python" and you want to be notified over Matrix.

To do this, begin with the visual-selector tool. (This requires the Playwright browser interface to be connected.)

Image by: (Leigh Morresi, CC BY-SA 4.0)

The visual-selector tool automatically calculates the best Xpath or CSS filter to target the content. Otherwise, you would get a lot of noise from the daily page updates.

Next, visit the Filters & Triggers tab.

Image by: (Leigh Morresi, CC BY-SA 4.0)

In the CSS/JSON/XPATH Filter field (the blue circle), you can see the automatically generated CSS filter from the previous step.

There are several useful filters available, such as Remove elements (good for removing noisy elements), Ignore text, Trigger/wait for text, and Block change-detection if text matches (used for waiting for some text to disappear, like "sold out").

In Trigger/wait for text (the red circle), type in the keyword you want to monitor for. (That's "python" in this example.)

The final step is in the Notifications tab, where you configure where you want to receive your notification. Below I added a Matrix room as the notification target, using the Matrix API.

Image by: (Leigh Morresi, CC BY-SA 4.0)

The notification URL is in the format of matrixs://username:password@matrix.org/#/room/#room-name:matrix.org

However, t2Bot format is also supported. Here are more Matrix notification options.

And that's it! Now you'll receive a message over Matrix whenever the content changes.

There's more

There's so much more to changedetection.io. Instead of using one of the built-in notification services, you can call a custom JSON API (use jsons://). You can also create a custom HTTP request (POST and GET), execute JavaScript before checking (perhaps to pre-fill a username and password login field), and use many more interesting features, with more to come.

Stop browsing the web and start watching the web instead!

Use changedetection.io to get alerts when a website makes changes or updates.

Tools Web development

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

OpenSSF: on a mission to improve security of open source software

Mon, 09/26/2022 - 15:00
OpenSSF: on a mission to improve security of open source software Gaurav Kamathe Mon, 09/26/2022 - 03:00

Open source software (OSS), once a niche segment of the development landscape, is now ubiquitous. This growth is fantastic for the open source community. However, as the usage of OSS increases, so do concerns about security. Especially in mission-critical applications (think medical devices, automobiles, space flight, and nuclear facilities), securing open source technology is of the utmost priority. No individual entity, whether developers, organizations, or governments, can single-handedly solve this problem. The best outcome is possible when all of them come together to collaborate.

The Open Source Security Foundation (OpenSSF) formed to facilitate this collaboration. OpenSSF is best described in its own words:

The OpenSSF is a cross-industry collaboration that brings together leaders to improve the security of open source software by building a broader community with targeted initiatives and best practices.

Vision

The technical vision of OpenSSF is to handle security proactively, by default. Developers are rightly at the center of this vision. OpenSSF seeks to empower developers to learn secure development practices and automatically receive guidance on them through the day-to-day tools they use. Researchers who identify security issues can send this information backward through the supply chain to someone who can rapidly address the issue. Auditors and regulators are encouraged to devise security policies that can be easily enforced via tooling, and community members provide information on the components they use and test regularly.

Mobilization plan

OpenSSF drafted a mobilization plan based on input from open source developers and leaders from US federal agencies. The result is a set of high-impact actions aimed at improving the resiliency and security of open source software. Based on this plan, 10 streams of investments have been identified, including security education, risk assessment, memory safety, and supply chain improvement. While discussion of these issues is widespread, OpenSSF is the platform that has collected and prioritized these concerns over others to ensure a secure open source ecosystem.

Working groups

Because the 10 streams of investments are quite diverse, OpenSSF is divided into multiple working groups. This strategy allows individual teams to focus on a specific area of expertise and move forward without getting bogged down with more general concerns. The working groups have something for everyone: Developers can contribute to security tooling, maintainers can handle software repositories, and others can contribute by educating developers on best practices, identifying metrics for open source projects, or identifying and securing the critical projects that form the core of the OSS ecosystem.

Industry participation

Multiple software vendors have become members of OpenSSF in their own capacity. These vendors are important players in the IT ecosystem, ranging from cloud service providers and operating system vendors to companies hosting OSS repositories, creating security tooling, creating computing hardware, and more. The benefit is getting inputs from a variety of sources that others might not be aware of and then collaboratively working on those issues.

Getting involved

There are a variety of ways to participate in the OpenSSF initiative based on your expertise and the amount of time you can set aside for it:

  • Sign up for their mailing list to follow the latest updates and discussions and update your calendar with OpenSSF meetings.
  • If you are looking for more interactive communication, consider joining their Slack channel.
  • Browse through their past meetings on their YouTube channel.
  • Organizations can consider becoming a member of OpenSSF.
  • Developers can quickly look up the GitHub repo for the software projects they are working on.
  • Most important, consider joining a working group of your choice and make a difference.
Conclusion

The security industry is growing and needs active participation from the open source community. Whether you are just starting out or want to specialize in security, OpenSSF provides a platform to work on the right problems under the guidance of experienced peers.

Developers, businesses, and government agencies are working together to ensure the security of open source software, and you can join them.

Image by:

Tumisu. CC0

Security and privacy. This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

The story behind Joplin, the open source note-taking app

Mon, 09/26/2022 - 15:00
The story behind Joplin, the open source note-taking app Richard Chambers Mon, 09/26/2022 - 03:00

In this interview, I met up with Laurent Cozic, creator of the note-taking app Joplin. Joplin was a winner of the 20i FOSS Awards, so I wanted to find out what makes it such a success and how he achieved it.

Could you summarize what Joplin does?

Joplin is an open source note-taking app. It allows you to capture your thoughts and securely access them from any device.

Obviously, there are other note-taking apps out there—but apart from it being free to use, what makes it different?

The fact that it is open source is an important aspect for many of our users, because it means there is no vendor lock-in on the data, and that data can be easily exported and accessed in various ways.

We also focus on security and data privacy, in particular with end-to-end encryption for synchronization, and by being transparent about any connection the application makes. We also work with security researchers to keep the app secure.

Finally, Joplin can be customized in several different ways—through plugins, which can add new functionalities, and themes to customize the app appearance. We also expose a data API, which allows third-party applications to access Joplin data.

[ Related read 5 note-taking apps for Linux ]

It's a competitive market, so what inspired you to build it?

It happened organically. I started looking into it in 2016, as I was looking at existing commercial note-taking applications, and I didn't like that the notes, attachments, or tags could not easily be exported or manipulated by other tools.

This is probably due partly to vendor lock-in and partly to a lack of motivation, since the vendor has no incentive to help users move their data to other apps. These companies also usually keep the notes in plain text, which can potentially cause issues in terms of data privacy and security.

So I decided to start creating a simple mobile and terminal application with sync capabilities to have my notes easily accessible on my devices. Later the desktop app was created and the project grew from there.

Image by:

(Opensource.com, CC BY-SA 4.0)

How long did Joplin take to make?

I've been working on it on and off since 2016, but not full time. For the past two years, I've been focusing on it more.

What advice might you have for someone setting out to create their own open source app?

Pick a project you use yourself and technologies you enjoy working with.

Managing an open source project can be difficult sometimes, so there has to be this element of fun to make it worthwhile. Then I guess "release early, release often" applies here, so that you can gauge users' interest and whether it makes sense to spend time developing the project further.

How many people are involved in Joplin's development?

There are 3-4 people involved in the development. At the moment we also have six students working on the project as part of Google Summer of Code.

[ Also read Our journey to open source during Google Summer of Code ]

Lots of people create open source projects, yet Joplin has been a resounding success for you. Could you offer creators any tips on how to get noticed?

There's no simple formula, and to be honest, I don't think I could replicate the success in a different project! You've got to be passionate about what you're doing, but also be rigorous, be organized, make steady progress, ensure the code quality remains high, and have a lot of unit tests to prevent regressions.

Also be open to the user feedback you receive, and try to improve the project based on it.

Once you've got all that, the rest is probably down to luck—if it turns out you're working on a project that interests a lot of people, things might work out well!

Once you get noticed, how do you keep that momentum going, if you don't have a traditional marketing budget?

I think it's about listening to the community around the project. For example I never planned to have a forum but someone suggested it on GitHub, so I made one and it became a great way to share ideas, discuss features, provide support, and so on. The community is generally welcoming of newcomers too, which creates a kind of virtuous circle.

Next to this, it's important to communicate regularly about the project.

We don't have a public roadmap, because the ETA for most features is generally "I don't know", but I try to communicate about coming features, new releases, and so on. We also communicate about important events, the Google Summer of Code in particular, or when we have the chance to win something like the 20i FOSS Awards.

Finally, very soon we'll have an in-person meetup in London, which is another way to keep in touch with the community and collaborators.

How does user feedback influence the roadmap?

Significantly. Contributors will often work on something simply because they need the feature. But next to this, we also keep track of the features that seem most important to users, based on what we read about on the forum and on the GitHub issue tracker.

For example, the mobile app is now a high priority because we frequently hear from users that its limitations and issues are an obstacle to using Joplin effectively.

Image by:

(Opensource.com, CC BY-SA 4.0)

How do you keep up to date with the latest in dev and coding?

Mostly by reading Hacker News!

Do you have a personal favorite FOSS that you'd recommend?

Among the less well-known projects, SpeedCrunch is very good as a calculator. It has a lot of features and it's great how it keeps a history of all previous calculations.

I also use KeepassXC as a password manager. It has been improving steadily over the past few years.

Finally, Visual Studio Code is great as a cross-platform text editor.


I'd assumed that Joplin was named after Janis, but Wikipedia tells me it's Scott Joplin. What made you choose the name?

I wanted to name it "jot-it" at first but I think the name was already taken.

Since I was listening to Scott Joplin ragtime music a lot back then (I was pretty much obsessed with it), I decided to use his name.

I think the meaning of a product name is not too important, as long as the name itself is easy to write, pronounce, remember, and perhaps is associated with something positive (or at least nothing negative).

And I think "Joplin" ticks all these boxes.

Is there anything you can say about plans for Joplin? An exclusive tease of a new feature, perhaps?

As mentioned earlier, we are very keen to make improvements to the mobile app, both in terms of UX design and new features.

We're also looking at creating a "Plugin Store" to make it easier to browse and install plugins.

Thanks for your time, Laurent, and best of luck with the future of Joplin.

This interview was originally published on the 20i blog and has been republished with permission.

Laurent Cozic sat down with me to discuss how Joplin got started and what's next for the open source note-taking app.

Image by:

Opensource.com

Art and design. This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Drop your database for PostgreSQL

Sat, 09/24/2022 - 15:00
Drop your database for PostgreSQL Seth Kenlon Sat, 09/24/2022 - 03:00

Databases are tools to store information in an organized but flexible way. A spreadsheet is essentially a database, but the constraints of a graphical application render most spreadsheet applications useless to programmers. With edge and IoT devices becoming significant target platforms, developers need powerful but lightweight solutions for storing, processing, and querying large amounts of data. One of my favorite combinations is the PostgreSQL database with Lua bindings, but the possibilities are endless. Whatever language you use, Postgres is a great choice for a database, but you need to know some basics before adopting it.

Install Postgres

To install PostgreSQL on Linux, use your software repository. On Fedora, CentOS, Mageia, and similar:

$ sudo dnf install postgresql postgresql-server

On Debian, Linux Mint, Elementary, and similar:

$ sudo apt install postgresql postgresql-contrib

On macOS and Windows, download an installer from postgresql.org.

Setting up Postgres

Most distributions install the Postgres database without starting it, but provide you with a script or systemd service to help it start reliably. However, before you start PostgreSQL, you must create a database cluster.

Fedora

On Fedora, CentOS, or similar, there's a Postgres setup script provided in the Postgres package. Run this script for easy configuration:

$ sudo /usr/bin/postgresql-setup --initdb
[sudo] password:
 * Initializing database in '/var/lib/pgsql/data'
 * Initialized, logs are in /var/lib/pgsql/initdb_postgresql.log

Debian

On Debian-based distributions, setup is performed automatically by apt during installation.

Everything else

Finally, if you're running something else, then you can just use the toolchain provided by Postgres itself. The initdb command creates a database cluster, but you must run it as the postgres user, an identity you may temporarily assume using sudo:

$ sudo -u postgres initdb -D /var/lib/pgsql/data \
  --locale en_US.UTF-8 --auth md5 --pwprompt

Start Postgres

Now that a cluster exists, start the Postgres server using either the command provided to you in the output of initdb or with systemd:

$ sudo systemctl start postgresql

Creating a database user

To create a Postgres user, use the createuser command. The postgres user is the superuser of the Postgres install, so run createuser as the postgres user:

$ sudo -u postgres createuser --interactive --password bogus
Shall the new role be a superuser? (y/n) n
Shall the new role be allowed to create databases? (y/n) y
Shall the new role be allowed to create more new roles? (y/n) n
Password:

Create a database

To create a new database, use the createdb command. In this example, I create the database exampledb and assign ownership of it to the user bogus:

$ createdb exampledb --owner bogus

Interacting with PostgreSQL

You can interact with a PostgreSQL database using the psql command. This command provides an interactive shell so you can view and update your databases. To connect to a database, specify the user and database you want to use:

$ psql --user bogus exampledb
psql (XX.Y)
Type "help" for help.

exampledb=>

Create a table

Databases contain tables, which can be visualized as a spreadsheet. There's a series of rows (called records in a database) and columns. The intersection of a row and a column is called a field.

The Structured Query Language (SQL) is named after what it provides: A method to inquire about the contents of a database in a predictable and consistent syntax to receive useful results.

Currently, your database is empty, devoid of any tables. You can create a table with the CREATE query. It's useful to combine this with the IF NOT EXISTS clause, which prevents an error should the table already exist.

Before you create a table, think about what kind of data (the "data type" in SQL terminology) you expect the table to contain. In this example, I create a table with one column for a unique identifier and one column for some arbitrary text up to nine characters.

exampledb=> CREATE TABLE IF NOT EXISTS my_sample_table(
exampledb(> id SERIAL,
exampledb(> wordlist VARCHAR(9) NOT NULL
);

The SERIAL keyword isn't actually a data type. It's special notation in PostgreSQL that creates an auto-incrementing integer field. The VARCHAR keyword is a data type indicating a variable number of characters within a limit. In this code, I've specified a maximum of 9 characters. There are lots of data types in PostgreSQL, so refer to the project documentation for a list of options.
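
For instance, here's a quick sketch of a table definition that mixes several other common types; the table and column names are purely illustrative:

-- Illustrative only: a table using several common PostgreSQL data types
CREATE TABLE IF NOT EXISTS sample_events(
    id SERIAL,                       -- auto-incrementing integer
    title VARCHAR(64) NOT NULL,      -- variable-length text, up to 64 characters
    body TEXT,                       -- text of unlimited length
    happened_at TIMESTAMP,           -- date and time
    attendance INTEGER,              -- whole number
    is_public BOOLEAN DEFAULT TRUE   -- true or false
);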

Insert data

You can populate your new table with some sample data by using the INSERT SQL keyword:

exampledb=> INSERT INTO my_sample_table (wordlist) VALUES ('Alice');
INSERT 0 1

Your data entry fails should you attempt to put more than nine characters into the wordlist field:

exampledb=> INSERT INTO my_sample_table (WORDLIST) VALUES ('Alexandria');
ERROR:  value too long for type character varying(9)

Alter a table or column

When you need to change a field definition, you use the ALTER SQL keyword. For instance, should you decide that a nine-character limit for wordlist is too restrictive, you can increase its allowance by changing its data type:

exampledb=> ALTER TABLE my_sample_table
ALTER COLUMN wordlist SET DATA TYPE VARCHAR(10);
ALTER TABLE
exampledb=> INSERT INTO my_sample_table (WORDLIST) VALUES ('Alexandria');
INSERT 0 1

View data in a table

SQL is a query language, so you view the contents of a database through queries. Queries can be simple, or they can involve joining complex relationships between several different tables. To see everything in a table, use the SELECT keyword on * (an asterisk is a wildcard):

exampledb=> SELECT * FROM my_sample_table;
 id |  wordlist
----+------------
  1 | Alice
  2 | Bob
  3 | Alexandria
(3 rows)

More data

PostgreSQL can handle a lot of data, but as with any database the key to success is how you design your database for storage and what you do with the data once you've got it stored. A relatively large public data set can be found on OECD.org, and using this you can try some advanced database techniques.

First, download the data as comma-separated values (CSV) and save the file as land-cover.csv in your home directory.

Browse the data in a text editor or spreadsheet application to get an idea of what columns there are, and what kind of data each column contains. Look at the data carefully and keep an eye out for exceptions to an apparent rule. For instance, the COU column, containing a country code such as AUS for Australia and GRC for Greece, tends to be 3 characters until the oddity BRIICS.

Once you understand the data you're working with, you can prepare a Postgres database:

$ createdb landcoverdb --owner bogus
$ psql --user bogus landcoverdb
landcoverdb=> create table land_cover(
country_code varchar(6),
country_name varchar(76),
small_subnational_region_code varchar(5),
small_subnational_region_name varchar(14),
large_subnational_region_code varchar(17),
large_subnational_region_name varchar(44),
measure_code varchar(13),
measure_name varchar(29),
land_cover_class_code varchar(17),
land_cover_class_name varchar(19),
year_code integer,
year_value integer,
unit_code varchar(3),
unit_name varchar(17),
power_code integer,
power_name varchar(9),
reference_period_code varchar(1),
reference_period_name varchar(1),
value float(8),
flag_codes varchar(1),
flag_names varchar(1));

Importing data

Postgres can import CSV data directly using the special metacommand \copy:

landcoverdb=> \copy land_cover from '~/land-cover.csv' with csv header delimiter ','
COPY 22113

That's 22,113 records imported. Seems like a good start!
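
If you want to double-check the import from within psql, a quick count should match the number COPY reported:

landcoverdb=> SELECT COUNT(*) FROM land_cover;
 count
-------
 22113
(1 row)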

Querying data

A broad SELECT statement to see all columns of all 22,113 records is possible, and Postgres very nicely pipes the output to a screen pager so you can scroll through the output at a leisurely pace. However, using advanced SQL you can get some useful views of what's otherwise some pretty raw data.

landcoverdb=> SELECT
    lcm.country_name,
    lcm.year_value,
    SUM(lcm.value) sum_value
FROM land_cover lcm
JOIN (
    SELECT
        country_name,
        large_subnational_region_name,
        small_subnational_region_name,
        MAX(year_value) max_year_value
    FROM land_cover
    GROUP BY country_name,
        large_subnational_region_name,
        small_subnational_region_name
) AS lcmyv
ON
    lcm.country_name = lcmyv.country_name AND
    lcm.large_subnational_region_name = lcmyv.large_subnational_region_name AND
    lcm.small_subnational_region_name = lcmyv.small_subnational_region_name AND
    lcm.year_value = lcmyv.max_year_value
GROUP BY lcm.country_name,
    lcm.large_subnational_region_name,
    lcm.small_subnational_region_name,
    lcm.year_value
ORDER BY country_name,
    year_value;

Here's some sample output:

---------------+------------+------------
 Afghanistan    |       2019 |  743.48425
 Albania        |       2019 |  128.82532
 Algeria        |       2019 |  2417.3281
 American Samoa |       2019 |   100.2007
 Andorra        |       2019 |  100.45613
 Angola         |       2019 |  1354.2192
 Anguilla       |       2019 | 100.078514
 Antarctica     |       2019 |  12561.907
[...]

SQL is a rich language, so a full exploration of it is beyond the scope of this article. Read through the SQL code and see if you can modify it to provide a different set of data.
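
As a starting point, here's one possible variation that stays on the same table: it totals the value column by land cover class for a single country and year. The country and year are just examples taken from the sample output above, so adjust them to whatever appears in your copy of the data set:

landcoverdb=> SELECT
    land_cover_class_name,
    SUM(value) AS total_value
FROM land_cover
WHERE country_name = 'Albania'
  AND year_value = 2019
GROUP BY land_cover_class_name
ORDER BY total_value DESC;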

Open database

PostgreSQL is one of the great open source databases. With it, you can design repositories for structured data, and then use SQL to view it in different ways so you can gain fresh perspectives on that data. Postgres integrates with many languages, including Python, Lua, Groovy, Java, and more, so regardless of your toolset, you can probably make use of this excellent database.

Postgres is one of the most flexible databases available, and it's open source.

Image by:

Image by Mapbox Uncharted ERG, CC-BY 3.0 US

Databases Edge computing. This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

How to build a dynamic distributed database with DistSQL

Fri, 09/23/2022 - 15:00
How to build a dynamic distributed database with DistSQL Raigor Jiang Fri, 09/23/2022 - 03:00

Distributed databases are common for many reasons. They increase reliability, redundancy, and performance. Apache ShardingSphere is an open source framework that enables you to transform any database into a distributed database. Since the release of ShardingSphere 5.0.0, DistSQL (Distributed SQL) has provided dynamic management for the ShardingSphere ecosystem.

In this article, I demonstrate a data sharding scenario in which DistSQL's flexibility allows you to create a distributed database. I also show some syntactic sugar that simplifies operations and lets users choose their preferred syntax.

I run a series of DistSQL statements through a practical case to give you a complete set of sharding management methods for creating and maintaining a distributed database dynamically.

Image by:

(Jiang Longtao, CC BY-SA 4.0)

What is sharding?

In database terminology, sharding is the process of partitioning a table into separate entities. While the table data is directly related, it often exists on different physical database nodes or, at the very least, within separate logical partitions.

Practical case example

To follow along with this example, you must have these components in place, either in your lab or in your mind as you read this article:

  • Two sharding tables: t_order and t_order_item.
  • For both tables, database shards are carried out with the user_id field, and table shards with the order_id field.
  • The number of shards is two databases times three tables.
Image by:

(Jiang Longtao, CC BY-SA 4.0)

Set up the environment

1. Prepare a database (MySQL, MariaDB, PostgreSQL, or openGauss) instance for access. Create two new databases: demo_ds_0 and demo_ds_1.

2. Deploy Apache ShardingSphere-Proxy 5.1.2 and Apache ZooKeeper. ZooKeeper acts as a governance center and stores ShardingSphere metadata information.

3. Configure server.yaml in the Proxy conf directory as follows:

mode:
  type: Cluster
  repository:
    type: ZooKeeper
    props:
      namespace: governance_ds
      server-lists: localhost:2181 #ZooKeeper address
      retryIntervalMilliseconds: 500
      timeToLiveSeconds: 60
      maxRetries: 3
      operationTimeoutMilliseconds: 500
  overwrite: false

rules:
  - !AUTHORITY
    users:
      - root@%:root

4. Start ShardingSphere-Proxy and connect to the Proxy using a client, for example:

$ mysql -h 127.0.0.1 -P 3307 -u root -p

5. Create a distributed database:

CREATE DATABASE sharding_db;

USE sharding_db;

Add storage resources

Next, add storage resources corresponding to the database:

ADD RESOURCE ds_0 (
    HOST=127.0.0.1,
    PORT=3306,
    DB=demo_ds_0,
    USER=root,
    PASSWORD=123456
), ds_1(
    HOST=127.0.0.1,
    PORT=3306,
    DB=demo_ds_1,
    USER=root,
    PASSWORD=123456
);

View the storage resources:

mysql> SHOW DATABASE RESOURCES\G;
******** 1. row ***************************
         name: ds_1
         type: MySQL
         host: 127.0.0.1
         port: 3306
           db: demo_ds_1
          -- Omit partial attributes
******** 2. row ***************************
         name: ds_0
         type: MySQL
         host: 127.0.0.1
         port: 3306
           db: demo_ds_0
          -- Omit partial attributes

Adding the optional \G switch to the query statement makes the output format easy to read.

Create sharding rules

ShardingSphere's sharding rules support regular sharding and automatic sharding. Both sharding methods have the same effect. The difference is that the configuration of automatic sharding is more concise, while regular sharding is more flexible and independent.

Refer to the ShardingSphere documentation for more details on automatic sharding.
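
For comparison, an automatic sharding rule can be declared in a single statement. The sketch below is only an illustration: the table name t_order_auto is hypothetical, and algorithm names and property keys can differ between ShardingSphere versions, so check the DistSQL reference before relying on it:

-- Illustrative auto sharding rule: 4 shards spread across ds_0 and ds_1
CREATE SHARDING TABLE RULE t_order_auto (
RESOURCES(ds_0,ds_1),
SHARDING_COLUMN=order_id,
TYPE(NAME=hash_mod,PROPERTIES("sharding-count"=4)),
KEY_GENERATE_STRATEGY(COLUMN=order_id,TYPE(NAME=snowflake))
);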

Next, it's time to adopt regular sharding and use the INLINE expression algorithm to implement the sharding scenarios described in the requirements.

Primary key generator

The primary key generator creates a secure and unique primary key for a data table in a distributed scenario. For details, refer to the document Distributed Primary Key.

1. Create a primary key generator:

CREATE SHARDING KEY GENERATOR snowflake_key_generator (
TYPE(NAME=SNOWFLAKE)
);

2. Query the primary key generator:

mysql> SHOW SHARDING KEY GENERATORS;
+-------------------------+-----------+-------+
| name                    | type      | props |
+-------------------------+-----------+-------+
| snowflake_key_generator | snowflake | {}    |
+-------------------------+-----------+-------+
1 row in set (0.01 sec)

Sharding algorithm

1. Create a database sharding algorithm used by t_order and t_order_item in common:

-- Modulo 2 based on user_id in database sharding
CREATE SHARDING ALGORITHM database_inline (
TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}"))
);

2. Create different table shards algorithms for t_order and t_order_item:

-- Modulo 3 based on order_id in table sharding
CREATE SHARDING ALGORITHM t_order_inline (
TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_${order_id % 3}"))
);
CREATE SHARDING ALGORITHM t_order_item_inline (
TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_item_${order_id % 3}"))
);

3. Query the sharding algorithm:

mysql> SHOW SHARDING ALGORITHMS;
+---------------------+--------+---------------------------------------------------+
| name                | type   | props                                             |
+---------------------+--------+---------------------------------------------------+
| database_inline     | inline | algorithm-expression=ds_${user_id % 2}            |
| t_order_inline      | inline | algorithm-expression=t_order_${order_id % 3}      |
| t_order_item_inline | inline | algorithm-expression=t_order_item_${order_id % 3} |
+---------------------+--------+---------------------------------------------------+
3 rows in set (0.00 sec)

Create a default sharding strategy

A sharding strategy consists of a sharding column and a sharding algorithm; in this case, that means a databaseStrategy and a tableStrategy. Because t_order and t_order_item have the same database sharding field and sharding algorithm, create a default strategy to be used by all shard tables with no sharding strategy configured.

1. Create a default database sharding strategy:

CREATE DEFAULT SHARDING DATABASE STRATEGY (
TYPE=STANDARD,SHARDING_COLUMN=user_id,SHARDING_ALGORITHM=database_inline
);

2. Query default strategy:

mysql> SHOW DEFAULT SHARDING STRATEGY\G;
*************************** 1. row ***************************
                    name: TABLE
                    type: NONE
         sharding_column:
 sharding_algorithm_name:
 sharding_algorithm_type:
sharding_algorithm_props:
*************************** 2. row ***************************
                    name: DATABASE
                    type: STANDARD
         sharding_column: user_id
 sharding_algorithm_name: database_inline
 sharding_algorithm_type: inline
sharding_algorithm_props: {algorithm-expression=ds_${user_id % 2}}
2 rows in set (0.00 sec)

You have not configured the default table sharding strategy, so the default strategy of TABLE is NONE.

Set sharding rules

The primary key generator and sharding algorithm are both ready. Now you can create sharding rules. The method I demonstrate below is a little complicated and involves multiple steps. In the next section, I'll show you how to create sharding rules in just one step, but for now, witness how it's typically done.

First, define t_order:

CREATE SHARDING TABLE RULE t_order (
DATANODES("ds_${0..1}.t_order_${0..2}"),
TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM=t_order_inline),
KEY_GENERATE_STRATEGY(COLUMN=order_id,KEY_GENERATOR=snowflake_key_generator)
);

Here is an explanation of the values found above:

  • DATANODES specifies the data nodes of shard tables.
  • TABLE_STRATEGY specifies the table strategy; its SHARDING_ALGORITHM uses the previously created sharding algorithm t_order_inline.
  • KEY_GENERATE_STRATEGY specifies the primary key generation strategy of the table. Skip this configuration if primary key generation is not required.

Next, define t_order_item:

CREATE SHARDING TABLE RULE t_order_item (
DATANODES("ds_${0..1}.t_order_item_${0..2}"),
TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM=t_order_item_inline),
KEY_GENERATE_STRATEGY(COLUMN=order_item_id,KEY_GENERATOR=snowflake_key_generator)
);

Query the sharding rules to verify what you've created:

mysql> SHOW SHARDING TABLE RULES\G;
************************** 1. row ***************************
                           table: t_order
               actual_data_nodes: ds_${0..1}.t_order_${0..2}
             actual_data_sources:
          database_strategy_type: STANDARD
        database_sharding_column: user_id
database_sharding_algorithm_type: inline
database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}
              table_strategy_type: STANDARD
            table_sharding_column: order_id
    table_sharding_algorithm_type: inline
   table_sharding_algorithm_props: algorithm-expression=t_order_${order_id % 3}
              key_generate_column: order_id
               key_generator_type: snowflake
              key_generator_props:
*************************** 2. row ***************************
                            table: t_order_item
                actual_data_nodes: ds_${0..1}.t_order_item_${0..2}
              actual_data_sources:
           database_strategy_type: STANDARD
         database_sharding_column: user_id
 database_sharding_algorithm_type: inline
database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}
              table_strategy_type: STANDARD
            table_sharding_column: order_id
    table_sharding_algorithm_type: inline
   table_sharding_algorithm_props: algorithm-expression=t_order_item_${order_id % 3}
              key_generate_column: order_item_id
               key_generator_type: snowflake
              key_generator_props:
2 rows in set (0.00 sec)

This looks right so far. You have now configured the sharding rules for t_order and t_order_item.

You can skip the steps for creating the primary key generator, sharding algorithm, and default strategy, and complete the sharding rules in one step. Here's how to make it easier.

Sharding rule syntax

For instance, if you want to add a shard table called t_order_detail, you can create sharding rules as follows:

CREATE SHARDING TABLE RULE t_order_detail (
DATANODES("ds_${0..1}.t_order_detail_${0..1}"),
DATABASE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=user_id,SHARDING_ALGORITHM(TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}")))),
TABLE_STRATEGY(TYPE=STANDARD,SHARDING_COLUMN=order_id,SHARDING_ALGORITHM(TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="t_order_detail_${order_id % 3}")))),
KEY_GENERATE_STRATEGY(COLUMN=detail_id,TYPE(NAME=snowflake))
);

This statement specifies a database sharding strategy, table strategy, and primary key generation strategy, but it doesn't use existing algorithms. The DistSQL engine automatically uses the input expression to create an algorithm for the sharding rules of t_order_detail.

Now there's a primary key generator:

mysql> SHOW SHARDING KEY GENERATORS;
+--------------------------+-----------+-------+
| name                     | type      | props |
+--------------------------+-----------+-------+
| snowflake_key_generator  | snowflake | {}    |
| t_order_detail_snowflake | snowflake | {}    |
+--------------------------+-----------+-------+
2 rows in set (0.00 sec)

Display the sharding algorithm:

mysql> SHOW SHARDING ALGORITHMS;
+--------------------------------+--------+-----------------------------------------------------+
| name                           | type   | props                                               |
+--------------------------------+--------+-----------------------------------------------------+
| database_inline                | inline | algorithm-expression=ds_${user_id % 2}              |
| t_order_inline                 | inline | algorithm-expression=t_order_${order_id % 3}        |
| t_order_item_inline            | inline | algorithm-expression=t_order_item_${order_id % 3}   |
| t_order_detail_database_inline | inline | algorithm-expression=ds_${user_id % 2}              |
| t_order_detail_table_inline    | inline | algorithm-expression=t_order_detail_${order_id % 3} |
+--------------------------------+--------+-----------------------------------------------------+
5 rows in set (0.00 sec)

And finally, the sharding rules:

mysql> SHOW SHARDING TABLE RULES\G;
*************************** 1. row ***************************
                            table: t_order
                actual_data_nodes: ds_${0..1}.t_order_${0..2}
              actual_data_sources:
           database_strategy_type: STANDARD
         database_sharding_column: user_id
 database_sharding_algorithm_type: inline
database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}
              table_strategy_type: STANDARD
            table_sharding_column: order_id
    table_sharding_algorithm_type: inline
   table_sharding_algorithm_props: algorithm-expression=t_order_${order_id % 3}
              key_generate_column: order_id
               key_generator_type: snowflake
              key_generator_props:
*************************** 2. row ***************************
                            table: t_order_item
                actual_data_nodes: ds_${0..1}.t_order_item_${0..2}
              actual_data_sources:
           database_strategy_type: STANDARD
         database_sharding_column: user_id
 database_sharding_algorithm_type: inline
database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}
              table_strategy_type: STANDARD
            table_sharding_column: order_id
    table_sharding_algorithm_type: inline
   table_sharding_algorithm_props: algorithm-expression=t_order_item_${order_id % 3}
              key_generate_column: order_item_id
               key_generator_type: snowflake
              key_generator_props:
*************************** 3. row ***************************
                            table: t_order_detail
                actual_data_nodes: ds_${0..1}.t_order_detail_${0..1}
              actual_data_sources:
           database_strategy_type: STANDARD
         database_sharding_column: user_id
 database_sharding_algorithm_type: inline
database_sharding_algorithm_props: algorithm-expression=ds_${user_id % 2}
              table_strategy_type: STANDARD
            table_sharding_column: order_id
    table_sharding_algorithm_type: inline
   table_sharding_algorithm_props: algorithm-expression=t_order_detail_${order_id % 3}
              key_generate_column: detail_id
               key_generator_type: snowflake
              key_generator_props:
3 rows in set (0.01 sec)

In the CREATE SHARDING TABLE RULE statement, DATABASE_STRATEGY, TABLE_STRATEGY, and KEY_GENERATE_STRATEGY can reuse existing algorithms.

Alternatively, they can be defined inline in the statement itself. The difference is that the inline form creates additional algorithm objects.

Configuration and verification

Once you have created the configuration, you can verify the rules in the following ways.

1. Check node distribution:

DistSQL provides SHOW SHARDING TABLE NODES for checking node distribution, and users can quickly learn the distribution of shard tables:

mysql> SHOW SHARDING TABLE NODES;
+----------------+------------------------------------------------------------------------------------------------------------------------------+
| name           | nodes                                                                                                                        |
+----------------+------------------------------------------------------------------------------------------------------------------------------+
| t_order        | ds_0.t_order_0, ds_0.t_order_1, ds_0.t_order_2, ds_1.t_order_0, ds_1.t_order_1, ds_1.t_order_2                               |
| t_order_item   | ds_0.t_order_item_0, ds_0.t_order_item_1, ds_0.t_order_item_2, ds_1.t_order_item_0, ds_1.t_order_item_1, ds_1.t_order_item_2 |
| t_order_detail | ds_0.t_order_detail_0, ds_0.t_order_detail_1, ds_1.t_order_detail_0, ds_1.t_order_detail_1                                   |
+----------------+------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.01 sec)

mysql> SHOW SHARDING TABLE NODES t_order_item;
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| name         | nodes                                                                                                                        |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| t_order_item | ds_0.t_order_item_0, ds_0.t_order_item_1, ds_0.t_order_item_2, ds_1.t_order_item_0, ds_1.t_order_item_1, ds_1.t_order_item_2 |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

You can see that the node distribution of the shard table is consistent with what is described in the requirement.

SQL preview

Previewing SQL is also an easy way to verify configurations. Its syntax is PREVIEW SQL. First, make a query with no shard key, which routes to all shards:

mysql> PREVIEW SELECT * FROM t_order;
+------------------+---------------------------------------------------------------------------------------------+
| data_source_name | actual_sql                                                                                  |
+------------------+---------------------------------------------------------------------------------------------+
| ds_0             | SELECT * FROM t_order_0 UNION ALL SELECT * FROM t_order_1 UNION ALL SELECT * FROM t_order_2 |
| ds_1             | SELECT * FROM t_order_0 UNION ALL SELECT * FROM t_order_1 UNION ALL SELECT * FROM t_order_2 |
+------------------+---------------------------------------------------------------------------------------------+
2 rows in set (0.13 sec)

mysql> PREVIEW SELECT * FROM t_order_item;
+------------------+------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql                                                                                                 |
+------------------+------------------------------------------------------------------------------------------------------------+
| ds_0             | SELECT * FROM t_order_item_0 UNION ALL SELECT * FROM t_order_item_1 UNION ALL SELECT * FROM t_order_item_2 |
| ds_1             | SELECT * FROM t_order_item_0 UNION ALL SELECT * FROM t_order_item_1 UNION ALL SELECT * FROM t_order_item_2 |
+------------------+------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

Now specify user_id in a query with a single database route:

mysql> PREVIEW SELECT * FROM t_order WHERE user_id = 1;
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql                                                                                                                                        |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| ds_1             | SELECT * FROM t_order_0 WHERE user_id = 1 UNION ALL SELECT * FROM t_order_1 WHERE user_id = 1 UNION ALL SELECT * FROM t_order_2 WHERE user_id = 1 |
+------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.14 sec)

mysql> PREVIEW SELECT * FROM t_order_item WHERE user_id = 2;
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| data_source_name | actual_sql                                                                                                                                                       |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ds_0             | SELECT * FROM t_order_item_0 WHERE user_id = 2 UNION ALL SELECT * FROM t_order_item_1 WHERE user_id = 2 UNION ALL SELECT * FROM t_order_item_2 WHERE user_id = 2 |
+------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

Specify user_id and order_id with a single table route:

mysql> PREVIEW SELECT * FROM t_order WHERE user_id = 1 AND order_id = 1;
+------------------+------------------------------------------------------------+
| data_source_name | actual_sql                                                 |
+------------------+------------------------------------------------------------+
| ds_1             | SELECT * FROM t_order_1 WHERE user_id = 1 AND order_id = 1 |
+------------------+------------------------------------------------------------+
1 row in set (0.04 sec)

mysql> PREVIEW SELECT * FROM t_order_item WHERE user_id = 2 AND order_id = 5;
+------------------+-----------------------------------------------------------------+
| data_source_name | actual_sql                                                      |
+------------------+-----------------------------------------------------------------+
| ds_0             | SELECT * FROM t_order_item_2 WHERE user_id = 2 AND order_id = 5 |
+------------------+-----------------------------------------------------------------+
1 row in set (0.01 sec)

Single-table routes scan the fewest shard tables and offer the highest efficiency.
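
To go beyond PREVIEW and work with real rows, you can issue DDL and DML against the Proxy and let ShardingSphere route them to the physical shards for you. The column list below is only a sketch; the rules defined above only assume that user_id and order_id exist:

-- Run against the Proxy; the statement is routed to every t_order_* shard
CREATE TABLE t_order (
    order_id BIGINT NOT NULL,
    user_id INT NOT NULL,
    status VARCHAR(50),
    PRIMARY KEY (order_id)
);

-- order_id is omitted so the snowflake key generator can fill it in
INSERT INTO t_order (user_id, status) VALUES (1, 'INIT');

Querying SELECT * FROM t_order afterward merges the results from every shard back into a single logical table.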

Query unused resources

During system maintenance, algorithms or storage resources that are no longer in use may need to be released, or resources that need to be released may have been referenced and cannot be deleted. DistSQL's SHOW UNUSED RESOURCES command can solve these problems:

mysql> ADD RESOURCE ds_2 (
    ->     HOST=127.0.0.1,
    ->     PORT=3306,
    ->     DB=demo_ds_2,
    ->     USER=root,
    ->     PASSWORD=123456
    -> );
Query OK, 0 rows affected (0.07 sec)

mysql> SHOW UNUSED RESOURCES\G;
*************************** 1. row ***************************
                           name: ds_2
                           type: MySQL
                           host: 127.0.0.1
                           port: 3306
                             db: demo_ds_2
connection_timeout_milliseconds: 30000
      idle_timeout_milliseconds: 60000
      max_lifetime_milliseconds: 2100000
                  max_pool_size: 50
                  min_pool_size: 1
                      read_only: false
               other_attributes: {"dataSourceProperties":{"cacheServerConfiguration":"true","elideSetAutoCommits":"true","useServerPrepStmts":"true","cachePrepStmts":"true","useSSL":"false","rewriteBatchedStatements":"true","cacheResultSetMetadata":"false","useLocalSessionState":"true","maintainTimeStats":"false","prepStmtCacheSize":"200000","tinyInt1isBit":"false","prepStmtCacheSqlLimit":"2048","serverTimezone":"UTC","netTimeoutForStreamingResults":"0","zeroDateTimeBehavior":"round"},"healthCheckProperties":{},"initializationFailTimeout":1,"validationTimeout":5000,"leakDetectionThreshold":0,"poolName":"HikariPool-8","registerMbeans":false,"allowPoolSuspension":false,"autoCommit":true,"isolateInternalQueries":false}
1 row in set (0.03 sec)

Query unused primary key generator

DistSQL can also display unused sharding key generators with the SHOW UNUSED SHARDING KEY GENERATORS command:

mysql> SHOW SHARDING KEY GENERATORS;
+--------------------------+-----------+-------+
| name                     | type      | props |
+--------------------------+-----------+-------+
| snowflake_key_generator  | snowflake | {}    |
| t_order_detail_snowflake | snowflake | {}    |
+--------------------------+-----------+-------+
2 rows in set (0.00 sec)

mysql> SHOW UNUSED SHARDING KEY GENERATORS;
Empty set (0.01 sec)

mysql> CREATE SHARDING KEY GENERATOR useless (
    -> TYPE(NAME=SNOWFLAKE)
    -> );
Query OK, 0 rows affected (0.04 sec)

mysql> SHOW UNUSED SHARDING KEY GENERATORS;
+---------+-----------+-------+
| name    | type      | props |
+---------+-----------+-------+
| useless | snowflake |       |
+---------+-----------+-------+
1 row in set (0.01 sec)

Query unused sharding algorithm

DistSQL can reveal unused sharding algorithms with (you guessed it) the SHOW UNUSED SHARDING ALGORITHMS command:

mysql> SHOW SHARDING ALGORITHMS;
+--------------------------------+--------+-----------------------------------------------------+
| name                           | type   | props                                               |
+--------------------------------+--------+-----------------------------------------------------+
| database_inline                | inline | algorithm-expression=ds_${user_id % 2}              |
| t_order_inline                 | inline | algorithm-expression=t_order_${order_id % 3}        |
| t_order_item_inline            | inline | algorithm-expression=t_order_item_${order_id % 3}   |
| t_order_detail_database_inline | inline | algorithm-expression=ds_${user_id % 2}              |
| t_order_detail_table_inline    | inline | algorithm-expression=t_order_detail_${order_id % 3} |
+--------------------------------+--------+-----------------------------------------------------+
5 rows in set (0.00 sec)

mysql> CREATE SHARDING ALGORITHM useless (
    -> TYPE(NAME=INLINE,PROPERTIES("algorithm-expression"="ds_${user_id % 2}"))
    -> );
Query OK, 0 rows affected (0.04 sec)

mysql> SHOW UNUSED SHARDING ALGORITHMS;
+---------+--------+----------------------------------------+
| name    | type   | props                                  |
+---------+--------+----------------------------------------+
| useless | inline | algorithm-expression=ds_${user_id % 2} |
+---------+--------+----------------------------------------+
1 row in set (0.00 sec)

Query rules that use the target storage resources

You can also see which rules use a given storage resource with SHOW RULES USED RESOURCE. All rules that use a resource can be queried, not just sharding rules.

mysql> DROP RESOURCE ds_0;
ERROR 1101 (C1101): Resource [ds_0] is still used by [ShardingRule].

mysql> SHOW RULES USED RESOURCE ds_0;
+----------+----------------+
| type     | name           |
+----------+----------------+
| sharding | t_order        |
| sharding | t_order_item   |
| sharding | t_order_detail |
+----------+----------------+
3 rows in set (0.00 sec)

Query sharding rules that use the target primary key generator

You can find sharding rules using a key generator with SHOW SHARDING TABLE RULES USED KEY GENERATOR:

mysql> SHOW SHARDING KEY GENERATORS;
+--------------------------+-----------+-------+
| name                     | type      | props |
+--------------------------+-----------+-------+
| snowflake_key_generator  | snowflake | {}    |
| t_order_detail_snowflake | snowflake | {}    |
| useless                  | snowflake | {}    |
+--------------------------+-----------+-------+
3 rows in set (0.00 sec)

mysql> DROP SHARDING KEY GENERATOR snowflake_key_generator;
ERROR 1121 (C1121): Sharding key generator `[snowflake_key_generator]` in database `sharding_db` are still in used.

mysql> SHOW SHARDING TABLE RULES USED KEY GENERATOR snowflake_key_generator;
+-------+--------------+
| type  | name         |
+-------+--------------+
| table | t_order      |
| table | t_order_item |
+-------+--------------+
2 rows in set (0.00 sec)

Query sharding rules that use the target algorithm

Show sharding rules using a target algorithm with SHOW SHARDING TABLE RULES USED ALGORITHM:

mysql> SHOW SHARDING ALGORITHMS;
+--------------------------------+--------+-----------------------------------------------------+
| name                           | type   | props                                               |
+--------------------------------+--------+-----------------------------------------------------+
| database_inline                | inline | algorithm-expression=ds_${user_id % 2}              |
| t_order_inline                 | inline | algorithm-expression=t_order_${order_id % 3}        |
| t_order_item_inline            | inline | algorithm-expression=t_order_item_${order_id % 3}   |
| t_order_detail_database_inline | inline | algorithm-expression=ds_${user_id % 2}              |
| t_order_detail_table_inline    | inline | algorithm-expression=t_order_detail_${order_id % 3} |
| useless                        | inline | algorithm-expression=ds_${user_id % 2}              |
+--------------------------------+--------+-----------------------------------------------------+
6 rows in set (0.00 sec)

mysql> DROP SHARDING ALGORITHM t_order_detail_table_inline;
ERROR 1116 (C1116): Sharding algorithms `[t_order_detail_table_inline]` in database `sharding_db` are still in used.

mysql> SHOW SHARDING TABLE RULES USED ALGORITHM t_order_detail_table_inline;
+-------+----------------+
| type  | name           |
+-------+----------------+
| table | t_order_detail |
+-------+----------------+
1 row in set (0.00 sec)

Make sharding better

DistSQL provides a flexible syntax to help simplify operations. In addition to the INLINE algorithm, DistSQL supports standard sharding, compound sharding, HINT sharding, and custom sharding algorithms.

If you have any questions or suggestions about Apache ShardingSphere, please feel free to post them on the ShardingSphere GitHub.

Take a look at a data sharding scenario in which DistSQL's flexibility allows you to create a distributed database.

Image by:

Opensource.com

Databases. This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

Install JDBC on Linux in 3 steps

Fri, 09/23/2022 - 15:00
Install JDBC on Linux in 3 steps Seth Kenlon Fri, 09/23/2022 - 03:00

When you write an application, it's common to require data storage. Sometimes you're storing assets your application needs to function, and other times you're storing user data, including preferences and save data. One way to store data is in a database, and in order to communicate between your code and a database, you need a database binding or connector for your language. For Java, a common database connector is JDBC (Java Database Connectivity).

1. Install Java

Of course, to develop with Java you must also have Java installed. I recommend SDKman for Linux, macOS, and WSL or Cygwin. For Windows, you can download OpenJDK from developers.redhat.com.

2. Install JDBC with Maven

JDBC is an API, imported into your code with the statement import java.sql.*, but for it to be useful you must have a database driver and a database installed for it to interact with. The database driver you use and the database you want to communicate with must match: to interact with MySQL, you need a MySQL driver; to interact with SQLite3, you need the SQLite3 driver; and so on.

For this article, I use PostgreSQL, but all the major databases, including MariaDB and SQLite3, have JDBC drivers.

You can download JDBC for PostgreSQL from jdbc.postgresql.org. I use Maven to manage Java dependencies, so I include it in pom.xml (adjusting the version number for what's current on Maven Central):

<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.5.0</version>
</dependency>

3. Install the database

You have to install the database you want to connect to through JDBC. There are several very good open source databases, but I had to choose one for this article, so I chose PostgreSQL.

To install PostgreSQL on Linux, use your software repository. On Fedora, CentOS, Mageia, and similar:

$ sudo dnf install postgresql postgresql-server

On Debian, Linux Mint, Elementary, and similar:

$ sudo apt install postgresql postgresql-contrib

Database connectivity

If you're not using PostgreSQL, the same general process applies:

  1. Install Java.

  2. Find the JDBC driver for your database of choice and include it in your pom.xml file.

  3. Install the database (server and client) on your development OS.

Three steps and you're ready to start writing code.
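
To give you an idea of what that code might look like, here's a minimal sketch that connects to PostgreSQL through JDBC and runs a query. The database name, user, and password are placeholders for whatever you configured on your own system:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class JdbcDemo {
    public static void main(String[] args) {
        // Placeholder connection details: substitute your own database, user, and password
        String url = "jdbc:postgresql://localhost:5432/exampledb";

        try (Connection conn = DriverManager.getConnection(url, "bogus", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT version()")) {
            while (rs.next()) {
                // Print the PostgreSQL server version string
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

If the driver from your pom.xml is on the classpath and the database from step 3 is running, running this class should print the server's version string.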

Install Java, install JDBC with Maven, and install the database. Then you're ready to interact with databases in your Java code.

Image by:

Pixabay. CC0.

Java Linux Databases. This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.
