More Working with CSV Files from the Command Line

How to extract and manipulate CSV data using the command line.

Comments

Process Substitution

mattcen's picture

I've never seen this 'Process Substitution' done before! I just had a quick look at it in the Bash man page. It's something that would've been very useful to me on a few occasions over the past few months. Perhaps I haven't heard of it because it's not part of the POSIX standard?

Very cool! Thanks for showing us.

Regards,
Matt.

--
Regards,
Matthew Cengia

Useful even if not often used

Mitch Frazier's picture

Not something I use very often, it seems a bit "involved" so it doesn't always occur to me at first, but it is useful.

According to Wikipedia, ksh, zsh, and rc also support process substitution.

Mitch Frazier is an Associate Editor for Linux Journal.
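
For readers who have not met process substitution before, here is a minimal sketch of the idea, assuming bash and the standard sort and paste utilities. The two list functions and their names are hypothetical sample data, not taken from the video. Each `<(...)` runs a command and hands the outer command a file name it can read the output from, which is why the same result can also be had, more verbosely, with temporary files:

```shell
#!/bin/bash
# Sketch only: list_a and list_b are hypothetical sample data.
export LC_ALL=C   # fixed collation so the sort order is predictable

list_a() { printf '%s\n' carol alice bob; }
list_b() { printf '%s\n' bob dave alice; }

# With process substitution: paste sees two readable file names,
# one for each <(...) expression.
with_procsub=$(paste -d ' ' <(list_a | sort) <(list_b | sort))

# Equivalent without process substitution, using temporary files.
tmp=$(mktemp -d)
list_a | sort > "$tmp/a"
list_b | sort > "$tmp/b"
with_tmpfiles=$(paste -d ' ' "$tmp/a" "$tmp/b")
rm -rf "$tmp"

echo "$with_procsub"
# alice alice
# bob bob
# carol dave
```

The temporary-file form is the one to fall back on in shells that lack the `<(...)` syntax.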

text

Anonymous Blub's picture

these videos are great. i'd love to be able to follow up and learn for myself.

is there a chance you could post the session text so we could copy/paste/play?

thanks, AB

Good Idea

Mitch Frazier's picture

Here's a shell script that contains the code from this video and the previous one:

#!/bin/bash

# Extract emails:
# Could also do: grep -i -o '[^,]*@[^,]*' stuff.csv
cut -f 3 -d , stuff.csv
echo

# Use emails to extract lines:
for i in $(cut -f 3 -d , stuff.csv | sort --ignore-case | uniq --ignore-case)
do
        grep -i -F "$i" stuff.csv | head -n 1
done
echo

# Make sure POSIX mode is off so that the process substitution
# syntax "<(...)" works.
set +o posix

# Use emails to de-dupe a file and then extract lines:
for i in $(comm -23 \
                <(cut -f 3 -d , stuff.csv | sort --ignore-case | uniq --ignore-case) \
                <(cut -f 3 -d , nostuff.csv | sort --ignore-case | uniq --ignore-case))
do
        grep -i -F "$i" stuff.csv | head -n 1
done

Here's stuff.csv:

1,a,joe@example.com,e
2,b,jim@example.com,f
3,c,sally@example.com,g

And nostuff.csv:

3,c,sally@example.com,g
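
To try the last step of the script against these two files, something like the following sketch should work, assuming bash and GNU coreutils (for the long `--ignore-case` options). The temporary directory and the helper function `emails` are additions for illustration and are not part of the original script:

```shell
#!/bin/bash
# Sketch only: recreates the sample files in a scratch directory
# and runs the de-dupe loop from the script above.
set +o posix
export LC_ALL=C   # keep sort and comm in agreement on ordering

workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > stuff.csv <<'EOF'
1,a,joe@example.com,e
2,b,jim@example.com,f
3,c,sally@example.com,g
EOF

cat > nostuff.csv <<'EOF'
3,c,sally@example.com,g
EOF

# Hypothetical helper: sorted, unique e-mail column of a CSV file.
emails() { cut -f 3 -d , "$1" | sort --ignore-case | uniq --ignore-case; }

# E-mails in stuff.csv that are not in nostuff.csv, then the first
# matching line of stuff.csv for each of them.
result=$(
    for i in $(comm -23 <(emails stuff.csv) <(emails nostuff.csv))
    do
        grep -i -F "$i" stuff.csv | head -n 1
    done
)
echo "$result"
# 2,b,jim@example.com,f
# 1,a,joe@example.com,e

cd / && rm -rf "$workdir"
```

The lines come out in e-mail order rather than file order because the loop iterates over the sorted address list. Note also that `grep -F` matches the address anywhere on the line, so an address that is a substring of another one would match both.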


the comm command is great.

Anonymous's picture

the comm command is great. set difference on the command line has always been useful to me. thanks for the example!
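
As a rough illustration of the set operations comm offers, here is a small sketch using two hypothetical address lists (the file names and the extra address are made up for the example). Both inputs must already be sorted; the digits after the dash name the output columns to suppress, where column 1 is lines only in the first file, column 2 is lines only in the second, and column 3 is lines in both:

```shell
#!/bin/bash
# Sketch only: a.txt and b.txt are hypothetical, pre-sorted lists.
export LC_ALL=C
tmp=$(mktemp -d)
printf '%s\n' jim@example.com joe@example.com sally@example.com > "$tmp/a.txt"
printf '%s\n' sally@example.com zoe@example.com > "$tmp/b.txt"

only_a=$(comm -23 "$tmp/a.txt" "$tmp/b.txt")   # difference: in a, not in b
only_b=$(comm -13 "$tmp/a.txt" "$tmp/b.txt")   # difference: in b, not in a
both=$(comm -12 "$tmp/a.txt" "$tmp/b.txt")     # intersection

echo "only in a:"; echo "$only_a"
echo "only in b:"; echo "$only_b"
echo "in both:";   echo "$both"

rm -rf "$tmp"
```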
