Open-source News

Parsing data with strtok in C

opensource.com - Sat, 04/30/2022 - 15:00
Jim Hall, Sat, 04/30/2022 - 03:00

Some programs can process an entire file at once, while others need to examine the file line by line. In the latter case, you likely need to parse the data in each line. Fortunately, the standard C library has a function to do just that.

The strtok function breaks up a line of data according to "delimiters" that divide each field. It provides a streamlined way to parse data from an input string.

Reading the first token

Suppose your program needs to read a data file, where each line is separated into different fields with a semicolon. For example, one line from the data file might look like this:

102*103;K1.2;K0.5

In this example, store that in a string variable. You might have read this string into memory using any number of methods. Here's the line of code:

char string[] = "102*103;K1.2;K0.5";

Once you have the line in a string, you can use strtok to pull out "tokens." Each token is part of the string, up to the next delimiter. The basic call to strtok looks like this:

#include <string.h>
char *strtok(char *string, const char *delim);

The first call to strtok skips any leading delimiters, replaces the first delimiter after the token with a null (\0) character, then returns a pointer to the first token. If the string is empty or contains only delimiters, strtok returns NULL.

#include <stdio.h>
#include <string.h>

int
main()
{
  char string[] = "102*103;K1.2;K0.5";
  char *token;

  token = strtok(string, ";");

  if (token == NULL) {
    puts("empty string!");
    return 1;
  }

  puts(token);

  return 0;
}

This sample program pulls off the first token in the string, prints it, and exits. If you compile this program and run it, you should see this output:

102*103

102*103 is the first part of the input string, up to the first semicolon. That's the first token in the string.

Note that calling strtok modifies the string you are examining. If you want the original string preserved, make a copy before using strtok.

Reading the rest of the string as tokens

Separating the rest of the string into tokens requires calling strtok multiple times until all tokens are read. After parsing the first token with strtok, any further calls to strtok must use NULL in place of the string variable. The NULL allows strtok to use an internal pointer to the next position in the string.

Modify the sample program to read the rest of the string as tokens. Use a while loop to call strtok multiple times until you get NULL.

#include <stdio.h>
#include <string.h>

int
main()
{
  char string[] = "102*103;K1.2;K0.5";
  char *token;

  token = strtok(string, ";");

  if (token == NULL) {
    puts("empty string!");
    return 1;
  }

  while (token) {
    /* print the token */
    puts(token);

    /* get the next token from the same string */
    token = strtok(NULL, ";");
  }

  return 0;
}

By adding the while loop, you can parse the rest of the string, one token at a time. If you compile and run this sample program, you should see each token printed on a separate line, like this:

102*103
K1.2
K0.5

Multiple delimiters in the input string

Using strtok provides a quick and easy way to break up a string into just the parts you're looking for. You can use strtok to parse all kinds of data, from plain text files to complex data. However, be aware that strtok treats several delimiters in a row as a single delimiter, so it never returns an empty token.

For example, if you were reading CSV data (comma-separated values, such as data from a spreadsheet), you might expect a list of four numbers to look like this:

1,2,3,4

But if the third "column" in the data was empty, the CSV might instead look like this:

1,2,,4

This is where you need to be careful with strtok: it collapses the two adjacent commas into one and never reports the empty field. You can see this by modifying the sample program to call strtok with a comma delimiter:

#include <stdio.h>
#include <string.h>

int
main()
{
  char string[] = "1,2,,4";
  char *token;

  token = strtok(string, ",");

  if (token == NULL) {
    puts("empty string!");
    return 1;
  }

  while (token) {
    puts(token);
    token = strtok(NULL, ",");
  }

  return 0;
}

If you compile and run this new program, you'll see strtok interprets the ,, as a single comma and parses the data as three numbers:

1
2
4

Knowing this limitation in strtok can save you hours of debugging.

Using multiple delimiters in strtok

You might wonder why the strtok function uses a string for the delimiter instead of a single character. That's because strtok can look for different delimiters in the string. For example, a string of text might have spaces and tabs between each word. In this case, you would use each of those "whitespace" characters as delimiters:

#include <stdio.h>
#include <string.h>

int
main()
{
  char string[] = "  hello \t world";
  char *token;

  token = strtok(string, " \t");

  if (token == NULL) {
    puts("empty string");
    return 1;
  }

  while (token) {
    puts(token);
    token = strtok(NULL, " \t");
  }

  return 0;
}

Each call to strtok uses both a space and tab character as the delimiter string, allowing strtok to parse the line correctly into two tokens.

Wrap up

The strtok function is a handy way to read and interpret data from strings. Use it in your next project to simplify how you read data into your program.

Image by: kris krüg

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.

The Open 3D Foundation Welcomes Microsoft as a Premier Member to Advance the Future of Open Source 3D Development

The Linux Foundation - Sat, 04/30/2022 - 00:14

Microsoft joins over 25 organizations committed to democratizing 3D software development for games and simulations

SAN FRANCISCO – April 29, 2022 – The Open 3D Foundation (O3DF) is proud to welcome Microsoft as a Premier member alongside Adobe, AWS, Huawei, Intel, and Niantic. Microsoft’s participation in the project brings a wealth of knowledge and thought leadership, and it reinforces how much importance the industry places on making a high-fidelity, fully featured open-source 3D engine available to every industry, unencumbered by commercial terms.

Microsoft Principal Group Program Manager Paul Oliver will join the Governing Board of O3DF, supporting the Foundation’s commitment to ensure balanced collaboration and feedback that meets the needs of the Open 3D community. The Governing Board cultivates innovative relationships among stakeholders to drive the Foundation’s strategic direction and its stewardship of 3D visualization and simulation projects. 

“Microsoft’s roots in creativity run deep, and we want to help creators wherever they are, whoever they are, and whatever platform they’re creating for. Having the Linux Foundation create the Open 3D Foundation is a fantastic step towards helping more creators everywhere and we are excited to be a part of it.”

This move builds on Microsoft’s continued commitment to democratizing game development and making its tools and technologies available to game creators worldwide. Last year, the company made its Game Development Kit available to all developers through GitHub. With its new engagement with O3DF, Microsoft is extending a commitment to opening up technology to everyone.

“We are elated to have Microsoft join the Open 3D Foundation as a Premier member,” said Royal O’Brien, Executive Director of O3DF and General Manager of Games and Digital Media at the Linux Foundation. “Having incredible industry veterans like Microsoft contributing and helping drive innovation with the community for 3D engines is a huge benefit to the open-source community and the companies that use it alike.”

A Growing Community

Microsoft is one of 25 member companies since the public announcement of the Open 3D Foundation in July 2021. In November 2021, Open 3D Engine (O3DE) announced its first major release. The 21.11 Release allows simulation developers to create 3D content with the new O3DE Linux editor and engine runtime. This release also added a new Debian package and Windows installer that provides a faster route to getting started with the engine. The O3DE community is very active, averaging up to 2 million line changes and 350-450 commits monthly from 60-100 authors across 41 repos.

Where to See the Open 3D Engine Next

On June 20, the Open 3D Foundation will host Open 3D Connect, a half-day interactive meet-up, co-located with the Linux Foundation’s Open Source Summit North America in Austin, Texas.

Additionally, on October 18-19, the Open 3D Foundation will host its flagship conference, bringing together technology leaders, indie and independent 3D developers, and the academic community to share ideas, discuss hot topics and foster the future of 3D development across a variety of industries and disciplines. For those interested in sponsoring this event, please contact pr@o3d.foundation

Anyone interested in the Open 3D Engine is invited to get involved and connect with the community on Discord.com/invite/o3de and GitHub.com/o3de

About the Open 3D Engine (O3DE) project

The Open 3D Engine (O3DE) is the flagship project managed by the Open 3D Foundation (O3DF). The open-source project is a modular, cross-platform 3D engine built to power anything from AAA games to cinema-quality 3D worlds to high-fidelity simulations. The code is hosted on GitHub under the Apache 2.0 license. To learn more, please visit o3de.org.

About the Open 3D Foundation

Established in July 2021, the mission of the Open 3D Foundation (O3DF) is to make an open-source, fully-featured, high-fidelity, real-time 3D engine for building games and simulations, available to every industry. The Open 3D Foundation is home to the O3DE project. To learn more, please visit o3d.foundation.

About the Linux Foundation

Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. Linux Foundation’s projects are critical to the world’s infrastructure including Linux, Kubernetes, Node.js, and more. The Linux Foundation’s methodology focuses on leveraging best practices and addressing the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.

Media Inquiries:

pr@o3d.foundation
