Open-source News

AMD Revises Work On GPU Workload Hinting For Linux

Phoronix - Sat, 10/01/2022 - 20:39
AMD's Linux graphics driver engineers this week sent out their latest patch series working on workload hints that are used for dynamically tuning the power profile of AMD GPU SoCs based upon the specified workload indicator...

Open-Source Radeon Vulkan Driver Lands APU Fix For Red Dead Redemption 2

Phoronix - Sat, 10/01/2022 - 19:20
For those with an AMD APU system like the Steam Deck and using the Mesa Radeon Vulkan driver "RADV" and wanting to enjoy the popular game Red Dead Redemption 2, an important fix has been merged for Mesa 22.3...

Blumenkrantz Flushes 17.1k Lines Of Old Mesa Code

Phoronix - Sat, 10/01/2022 - 17:38
Well known Zink developer Mike Blumenkrantz, working for Valve on improving Mesa's OpenGL-on-Vulkan driver, has kicked off October by removing a lot of old Mesa code...

AMD Ryzen 7000 Series, Linux 6.0, MGLRU, Rust & IO_uring Made For An Exciting September

Phoronix - Sat, 10/01/2022 - 17:29
It was a very exciting September with the launch of the AMD Ryzen 7000 series "Zen 4" processors, Intel revealing a lot more about Arc Graphics, Linux 6.0 getting buttoned up while feature work toward Linux 6.1 accelerated, ongoing exciting kernel work around MGLRU / IO_uring / RT / etc, and other software releases like GNOME 43 and LLVM 15 all made for an eventful month...

What's new with Awk?

opensource.com - Sat, 10/01/2022 - 15:00
What's new with Awk? Jim Hall Sat, 10/01/2022 - 03:00

Awk is a powerful scripting tool that makes it easy to process text. Awk scripts use a pattern-action syntax, where Awk performs an action for every line in a file that matches a pattern. This provides a flexible yet powerful scripting language to deal with text. For example, the one-line Awk script /error/ {print $1, $2, $3} will print the first three space-delimited fields for any line that contains the word error.

While we also have the GNU variant of Awk, called Gawk, the original Awk remains under development. Recently, Brian Kernighan started a project to add Unicode support to Awk. I met with Brian to ask about the origins of Awk and his recent development work on Awk.

Jim Hall: Awk is a great tool to parse and process text. How did it start?

Brian Kernighan: The most direct influence was a tool that Marc Rochkind developed while working on the Programmer's Workbench system at Bell Labs. As I remember it now, Marc's program took a list of regular expressions and created a C program that would read an input file. Whenever the program found a match for one of the regular expressions, it printed the matching line. It was designed for creating error checking to run over log files from telephone operations data. It was such a neat idea—Awk is just a generalization.

Jim: AWK stands for the three of you who created it: Al Aho, Peter Weinberger, and Brian Kernighan. How did the three of you design and create Awk?

Brian: Al was interested in regular expressions and had recently implemented egrep, which provided a very efficient lazy-evaluation technique for a much bigger class of regular expressions than what grep provided. That gave us a syntax and working code.

Peter had been interested in databases, and as part of that he had some interest in report generation, like the RPG language that IBM provided. And I had been trying to figure out some kind of editing system that made it possible to handle strings and numbers with more or less equal ease.

We explored designs, but not for a long time. I think Al may have provided the basic pattern-action paradigm, but that was implicit in a variety of existing tools, like grep, the stream editor sed, and in the language tools YACC and Lex that we used for implementation. Naturally, the action language had to be C-like.

Jim: How was Awk first used at Bell Labs? When was Awk first adopted into Unix?

Brian: Awk was created in 1977, so it was part of 7th-edition Unix, which I think appeared in about 1979. I wouldn't say it was adopted, so much as it was just another program included because it was there. People picked it up very quickly, and we soon had users all over the Labs. People wrote much bigger programs than we had ever anticipated, too, even tens of thousands of lines, which was amazing. But for some kinds of applications, the language was a good match.

Jim: Has Awk changed over the years, or is Awk today more or less the same Awk from 1977?

Brian: Overall, it's been pretty stable, but there have been a fair number of small things, mostly to keep up with at least the core parts of Gawk. Examples include things like functions to do case conversion, shorthands for some kinds of regular expressions, or special filenames like /dev/stderr. Internally, there's been a lot of work to replace fixed-size arrays with arrays that grow. Arnold Robbins, who maintains Gawk, has also been incredibly helpful with Awk, providing good advice, testing, code, and help with Git.

Jim: You're currently adding Unicode support to Awk. This is one of those projects that seems obvious when you hear it, because Unicode is everywhere, but not every program supports it yet. Tell us about your project to add Unicode to Awk.

Brian: It's been sort of embarrassing for a while now that Awk only handled 8-bit input, though in fairness it predates Unicode by 10 or 20 years. Gawk, the GNU version, has handled Unicode properly for quite a while, so it's good to be up to date and compatible.

More Linux resources Linux commands cheat sheet Advanced Linux commands cheat sheet Free online course: RHEL technical overview Linux networking cheat sheet SELinux cheat sheet Linux common commands cheat sheet What are Linux containers? Our latest Linux articles

Jim: How big of a project is adding Unicode support? Did this require many changes to the source code?

Brian: I haven't counted, but it's probably 200 or 300 lines, primarily concentrated in either the regular expression recognizer or in the various built-in functions that have to operate in characters, not bytes, for Unicode input.

Jim: How far along are you in adding Unicode to Awk?

Brian: There's a branch of the code at GitHub that's pretty up to date. It's been tested, but there's always room for more testing.

One thing to mention: It handles UTF-8 input and output, but for Unicode code points, which are not the same thing as Unicode graphemes. This distinction is important but technically very complicated, at least as I understand it. As a simple example, a letter with an accent could be represented as two code points (letter and accent) or as a single character (grapheme). Doing this right, whatever that means, is very hard.

Jim: In a Computerphile video, you mention adding support for comma-separated values (CSV) parsing to Awk. How is that project going?

Brian: While I had my hands in the code again, I did add support for CSV input, since that's another bit of the language that was always clunky. I haven't done anything for CSV output, since that's easy to do with a couple of short functions, but maybe that should be revisited.

Jim: What kinds of things do you use Awk for in your day-to-day work?

Brian: Everything. Pretty much anything that fiddles text is a target for Awk. Certainly, the Awk program I use most is a simple one to make all lines in a text document the same length. I probably used it 100 times while writing answers to your questions.

Jim: What's the coolest (or most unusual) thing you have used Awk to do?

Brian: A long time ago, I wrote a C++ program that converted Awk programs into C++ that looked as close to Awk as I could manage, by doing things like overloading brackets for associative arrays. It was never used, but it was a fun exercise.

Further reading

Brian Kernighan discusses the scripting tool Awk, from its creation to current work on Unicode support.

Image by:

Jonas Leupe on Unsplash

Programming What to read next Talking digital with Brian Kernighan Learn awk by coding a "guess the number" game This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License. Register or Login to post a comment.

Google Announces Lyra V2 Low Bit-Rate Voice Codec

Phoronix - Sat, 10/01/2022 - 04:11
Last year Google announced the Lyra voice codec for low bit-rates that combined with the open AV1 codec could lead to voice chats on 56kbps connections. Lyra makes use of machine learning and other techniques for extremely low bit-rate speech compression that can function at 3kbps. Google last year open-sourced the Lyra code while today they announced the availability of Lyra V2...

Pages