More Working with CSV Files from the Command Line

How to extract and manipulate CSV data using the command line.

Comments

Process Substitution

mattcen's picture

I've never seen this 'Process Substitution' done before! I just had a quick look at it in the Bash man page. It's something that would've been very useful to me on a few occasions over the past few months. Perhaps I haven't heard of it because it's not part of the POSIX standard?

Very cool! Thanks for showing us.

Regards,
Matt.

--
Regards,
Matthew Cengia

Useful even if not often used

Mitch Frazier's picture

It's not something I use very often. It seems a bit "involved", so it doesn't always occur to me at first, but it is useful.

According to Wikipedia, ksh, zsh, and rc also support process substitution.
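For anyone who hasn't met it before, here is a minimal sketch of process substitution in Bash. The two lists are made up purely for illustration; the point is that `<(...)` lets a command that expects file names (here `comm`) read the output of another command instead.

```shell
#!/bin/bash
# Process substitution needs bash (or ksh/zsh) with POSIX mode off.
set +o posix

# Two hypothetical, unsorted lists used only for illustration.
list_a=$'pear\napple\nfig'
list_b=$'fig\nkiwi\napple'

# comm expects two sorted files. Each <(...) presents the sorted
# output of a pipeline as if it were a file, so no temporary
# files are needed.
common=$(comm -12 <(printf '%s\n' "$list_a" | sort) \
                  <(printf '%s\n' "$list_b" | sort))

# Print the lines that appear in both lists.
printf '%s\n' "$common"
```

Without process substitution, each sorted list would first have to be written to a temporary file and cleaned up afterwards.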

Mitch Frazier is an Associate Editor for Linux Journal.

text

Anonymous Blub's picture

these videos are great. i'd love to be able to follow up and learn for myself.

is there a chance you could post the session text so we could copy/paste/play?

thanks, AB

Good Idea

Mitch Frazier's picture

Here's a shell script that contains the code from this video and the previous one:

#!/bin/bash

# Extract emails:
# Could also do: grep -i -o '[^,]*@[^,]*' stuff.csv
cut -f 3 -d , stuff.csv
echo

# Use emails to extract lines:
for i in $(cut -f 3 -d , stuff.csv | sort --ignore-case | uniq --ignore-case)
do
        grep -i -F "$i" stuff.csv | head -n 1
done
echo

# Make sure POSIX mode is off so that the process substitution syntax "<(...)" works.
set +o posix

# Use comm to drop emails that also appear in nostuff.csv, then extract lines:
for i in $(comm -23 \
                <(cut -f 3 -d , stuff.csv | sort --ignore-case | uniq --ignore-case) \
                <(cut -f 3 -d , nostuff.csv | sort --ignore-case | uniq --ignore-case))
do
        grep -i -F "$i" stuff.csv | head -n 1
done

Here's stuff.csv:

1,a,joe@example.com,e
2,b,jim@example.com,f
3,c,sally@example.com,g

And nostuff.csv:

3,c,sally@example.com,g

the comm command is great.

Anonymous's picture

the comm command is great. set difference on the command line has always been useful to me. thanks for the example!
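As a rough sketch of set operations with comm, the snippet below recreates the stuff.csv and nostuff.csv samples from this thread in a scratch directory. The `emails` helper and the variable names are assumptions made for this example, and it uses a plain `sort -u` rather than the case-insensitive sort in the script above.

```shell
#!/bin/bash
# Process substitution needs POSIX mode off.
set +o posix

# Recreate the sample files from this thread in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/stuff.csv" <<'EOF'
1,a,joe@example.com,e
2,b,jim@example.com,f
3,c,sally@example.com,g
EOF
cat > "$workdir/nostuff.csv" <<'EOF'
3,c,sally@example.com,g
EOF

# Hypothetical helper: print the sorted, unique emails (field 3) of a CSV file.
emails() { cut -f 3 -d , "$1" | sort -u; }

# Set difference: -23 hides lines unique to the second input and
# lines common to both, leaving emails found only in stuff.csv.
only_in_stuff=$(comm -23 <(emails "$workdir/stuff.csv") \
                         <(emails "$workdir/nostuff.csv"))

# Intersection: -12 hides both "unique" columns, leaving shared emails.
in_both=$(comm -12 <(emails "$workdir/stuff.csv") \
                   <(emails "$workdir/nostuff.csv"))

printf 'only in stuff.csv:\n%s\n' "$only_in_stuff"
printf 'in both files:\n%s\n' "$in_both"

rm -r "$workdir"
```

Swapping `-23` for `-13` would give the difference in the other direction, that is, emails found only in nostuff.csv.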
