"Argument list too long": Beyond Arguments and Limitations

Four approaches to getting around argument length limitations on the command line.

At some point during your career as a Linux user, you may have come across the following error:

[user@localhost directory]$ mv * ../directory2
bash: /bin/mv: Argument list too long

The "Argument list too long" error, which occurs anytime a user feeds too many arguments to a single command, leaves the user to fend for oneself, since all regular system commands (ls *, cp *, rm *, etc...) are subject to the same limitation. This article will focus on identifying four different workaround solutions to this problem, each method using varying degrees of complexity to solve different potential problems. The solutions are presented below in order of simplicity, following the logical principle of Occam's Razor: If you have two equally likely solutions to a problem, pick the simplest.

Method #1: Manually split the command line arguments into smaller bunches.

Example 1

[user@localhost directory]$ mv [a-l]* ../directory2
[user@localhost directory]$ mv [m-z]* ../directory2

This method is the most basic of the four: it simply involves resubmitting the original command with fewer arguments, in the hope that this will solve the problem. Although this method may work as a quick fix, it is far from being the ideal solution. It works best if you have a list of files whose names are evenly distributed across the alphabet. This allows you to establish consistent divisions, making the chore slightly easier to complete. However, this method is a poor choice for handling very large quantities of files, since it involves resubmitting many commands and a good deal of guesswork.
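
Note that if the directory also contains names beginning with digits or punctuation, the two alphabetic ranges above will silently skip them. A variant sketch of Example 1, assuming bash's default globbing, covers the remainder with a negated character class (each half can, of course, still be too long on its own):

[user@localhost directory]$ mv [a-l]* ../directory2
[user@localhost directory]$ mv [^a-l]* ../directory2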

Method #2: Use the find command.

Example 2

[user@localhost directory]$ find $directory -type f -name '*' -exec mv {} $directory2/. \;

Method #2 involves filtering the list of files through the find command, instructing it to properly handle each file based on a specified set of command-line parameters. Due to the built-in flexibility of the find command, this workaround is easy to use, successful and quite popular. It allows you to selectively work with subsets of files based on their name patterns, date stamps, permissions and even inode numbers. In addition, and perhaps most importantly, you can complete the entire task with a single command.
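
As a hedged illustration of that selectivity (the file pattern and age here are invented for the example), the same construction can be narrowed to, say, only the .log files more than seven days old:

[user@localhost directory]$ find $directory -type f -name '*.log' -mtime +7 -exec mv {} $directory2/. \;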

The main drawback to this method is the length of time required to complete the process. Unlike Method #1, where groups of files get processed as a unit, this procedure actually inspects the individual properties of each file before performing the designated operation. The overhead involved can be quite significant, and moving lots of files individually may take a long time.
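
If that per-file overhead becomes a problem, one common mitigation, sketched here on the assumption that GNU findutils and GNU coreutils are available, is to let xargs batch the filenames so that mv is executed far fewer times:

[user@localhost directory]$ find $directory -maxdepth 1 -type f -print0 | xargs -0 mv --target-directory="$directory2"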

Method #3: Create a function.

Example 3a

function large_mv ()
{       while IFS= read -r line1; do
                mv directory/"$line1" ../directory2
        done
}
ls -1 directory/ | large_mv

Although writing a shell function does involve a certain level of complexity, I find that this method allows for a greater degree of flexibility and control than either Method #1 or #2. The short function given in Example 3a simply mimics the functionality of the find command given in Example 2: it processes each file individually, one by one. However, by writing a function you also gain the ability to perform an unlimited number of actions per file while still using a single command:

Example 3b

function larger_mv ()
{       while IFS= read -r line1; do
                md5sum directory/"$line1" >> ~/md5sums
                ls -l directory/"$line1" >> ~/backup_list
                mv directory/"$line1" ../directory2
        done
}
ls -1 directory/ | larger_mv

Example 3b demonstrates how easily you can get an md5sum and a backup listing of each file before moving it.

Unfortunately, since this method also requires that each file be dealt with individually, it will involve a delay similar to that of Method #2. From experience I have found that Method #2 is a little faster than the function given in Example 3a, so Method #3 should be used only in cases where the extra functionality is required.
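
One further caveat before moving on: because both functions read filenames from ls, names containing leading whitespace or embedded newlines can still be mishandled. A variant sketch, assuming bash and that every file in the directory should be processed, drives the same per-file actions from a shell glob instead; the glob is expanded inside the shell, so no single external command ever receives the full list:

function glob_mv ()
{       for f in directory/*; do
                md5sum "$f" >> ~/md5sums
                ls -l "$f" >> ~/backup_list
                mv "$f" ../directory2
        done
}
glob_mv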

Method #4: Recompile the Linux kernel.

This last method requires a word of caution, as it is by far the most aggressive solution to the problem. It is presented here for the sake of thoroughness, since it is a valid method of getting around the problem. However, please be advised that due to the advanced nature of the solution, only experienced Linux users should attempt this hack. In addition, make sure to thoroughly test the final result in your environment before implementing it permanently.

One of the advantages of using an open-source kernel is that you are able to examine exactly what it is configured to do and modify its parameters to suit the individual needs of your system. Method #4 involves manually increasing the number of pages that are allocated within the kernel for command-line arguments. If you look at the include/linux/binfmts.h file, you will find the following near the top:

/*
 * MAX_ARG_PAGES defines the number of pages allocated for   arguments
 * and envelope for the new program. 32 should suffice, this gives
 * a maximum env+arg of 128kB w/4KB pages!
 */
#define MAX_ARG_PAGES 32

In order to increase the amount of memory dedicated to command-line arguments, you simply need to set MAX_ARG_PAGES to a higher value. Once this edit is saved, recompile, install and reboot into the new kernel as you normally would.
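
As a rough sketch only, assuming a 2.4-era source tree at /usr/src/linux and an existing kernel configuration, the whole procedure might look like this (the exact make targets and bootloader step vary by distribution):

cd /usr/src/linux
# edit include/linux/binfmts.h and change the line to, for example:
#     #define MAX_ARG_PAGES 64
make dep && make bzImage && make modules && make modules_install
make install    # or copy arch/i386/boot/bzImage to /boot and update the bootloader yourself
reboot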

On my own test system I managed to solve all my problems by raising this value to 64. After extensive testing, I have not experienced a single problem since the switch. This is entirely expected since even with MAX_ARG_PAGES set to 64, the longest possible command line I could produce would only occupy 256KB of system memory--not very much by today's system hardware standards.

The advantages of Method #4 are clear: you are now able to run the command as you normally would, and it completes successfully. The disadvantages are equally clear: if you raise the amount of memory available to the command line beyond the amount of available system memory, you can effectively mount a denial-of-service attack on your own system and cause it to crash. On multiuser systems in particular, even a small increase can have a significant impact, because the additional memory is allocated for every user. Therefore, always test extensively in your own environment, as this is the safest way to determine whether Method #4 is a viable option for you.

______________________

Comments


Argument list too long.

EB's picture

If you want to scp a very large number of files from another machine and are getting the "list too long" error, you can try the following script. Slow, but steady.

#!/bin/bash

# Sometimes the file list is too long for scp. Then just do this one by one
# Quite slow, but works

echo "IMPORTANT: be sure you have ssh access without need for password "
# first generate a list of files.
/bin/rm -f SCPFILELIST
ssh remotename@remotemachine.com ls -1 /fullpath/DIRECTORY/ > SCPFILELIST

# In the next line, replace DPtime with your pattern. No wildcard *

for i in `cat SCPFILELIST | grep DPtime`
do
if [ -f "$i" ]
then

# this part is if you keep losing connection and want to avoid having
# to start from the first file each time. Be sure to check that there
# are no files that are partially downloaded because connection was lost

echo "$i exists - skipping"

else
echo "copying $i"
scp remotename@remotemachine.com:/fullpath/DIRECTORY/$i ./
fi
done

Why not just copy the

Words13100's picture

Why not just copy the containing directory instead of the files, then rename/move the directory?

variable argument length

Anonymous's picture

2.6.23 and newer kernels have a solution for this "argument list too long" problem:

Variable argument length

mdfind error

johndimartino's picture

I'm trying this command (on OS X)

find $(mdfind -s SmallAllRecovered -onlyin /Volumes/Untitled\ 1/All\ Recovered/) -type f | xargs mv /Volumes/Untitled\ 1/Small\ Recovered/

And I'm still getting the Argument List Too Long error. Anybody have a workaround?

Thanks

Use a file and narrow down.

Ben Griffin's picture

For example, with a large directory of files, one can also just do a general ls and then use grep to cut the list down,
e.g.:

ls huge_directory > /tmp/list$$

[...] processing

rm /tmp/list$$

Of course, for 'funny' filenames, you need to do a little more than that.

How to use xargs command to solve argument list too long

Terry Chu 's picture

When I try to run a file, it displays the following error messages:

./RUN.sh[95]: /usr/bin/ls: arg list too long
./Run.sh[102]: /usr/bin/ls: arg list too long

ls | grep " " | perl -p -i -e 'chomp; $a=$_; s/ //g; `mv \"$a\" \"$_\"`;'

Searching the Internet, I found suggestions to use the xargs command, so I added it to my code as shown below:

ls | grep " " |xargs perl -p -i -e 'chomp; $a=$_; s/ //g; `mv \"$a\" \"$_\"`;'

&&

xargs ls | grep " " | perl -p -i -e 'chomp; $a=$_; s/ //g; `mv \"$a\" \"$_\"`;'

I tried both methods and they fail to run:
./Run.sh[34]: xrgs: not found
./Run.sh[35]: xrgs: not found
This is my first time using the xargs command. Can anyone suggest a better approach or correct my mistake in using xargs?

Thanks

as vi fanatic, my way to

florian's picture

As a vi fanatic, my way to resolve this was the following:

ls /tmp > ls_tmp
vim ls_tmp
:v/i_want_this_in_the_name/d #delete all lines not containing i_want_this..
:%s/^/rm \/tmp\// # add rm at the beginning of each line
:wq
sudo bash ls_tmp

Yay. My favoured take on the

another vi enthusiast's picture

Yay. My favoured take on the issue, but why quit vi for it ?-)

After "vi ls_tmp", run ":r!ls tmp/", ":v/pattern//d", ":%s/^/rm tmp\//", ":w", ":!. ls_tmp", achieves the same and i can stay w/i vi.

Just use tar

Starbuck's picture

I appreciate this article but it left out another viable option. If the source is at /x/y/source and the target is /a/b/target:

cd /x/y
tar -cf AllFiles source/*
cd /a/b
tar -xf /x/y/AllFiles
mv source target

It's not pretty but it's dirt simple using common commands and more elegant than some of the options.

HTH

Still doesn't work for large numbers..

Anonymous's picture

I have 120,000 files to move, and the tar command also gave me "/usr/bin/tar: Argument list too long". Writing a script was more functional...

Re: Just use tar

borjonx's picture

thanks for posting your tar command. that was incredibly simple, portable & extensible.

Thank You

Julian Alimin's picture

I Really Needed This
I Had to copy 20000 files and really didn't know how.
Thanks

Method #4 : works very well, BUT MAYBE VERY DANGEROUS

Anonymous's picture

I used Method #4 and it worked very well, until the interior of /root was deleted!!!

I reinstalled the server, re-modified the kernel, and only ran my "remove script"... after two weeks BAM.. again /root was deleted.

It looks like it's creating a buffer overflow at some point... or it may be my script but I don't think so, it's a simple: find "specific folder" -exec rm {} \;

Use with caution....

Example 1 is wrong

Anonymous's picture

> [user@localhost directory]$ mv [a-l]* ../directory2
> [user@localhost directory]$ mv [m-z]* ../directory2

Should be:

[user@localhost directory]$ mv [a-l]* ../directory2
[user@localhost directory]$ mv [^a-l]* ../directory2

to avoid missing numbers, etc.

No luck recompiling

AlanW's picture

Tried #4 - change MAX_ARG_PAGES and recompile but it makes no difference. I noticed the MAX_ARG_PAGES is also hard-coded in include/linux/limits.h so changed it there too and recompiled.
Still no change. Does this mean everything else (mv, cp...etc.) has to be recompiled too? The author mentions making that one change and recompiling...any hints?

In my case this work fine:

Shadow's picture

In my case this works fine:

ls -1 -Q *.log | xargs rm -f

hello everybody i am have the

Anonymous's picture

Hello everybody, I am having the "argument list too long" issue, and now I can't even do ls. Can somebody tell me how to recover from this?
ls -1 -Q *.log | xargs rm -f
This is the command I used.

This is not a solution. ls

Anonymous's picture

This is not a solution. ls -1 blah*blah suffers from the same problem. The blah*blah expands to something too long and it produces the same error. Only "ls -1" WITH NO ARGUMENTS can be used... it can list all the files, no matter how many, only because it is the command input length that is the limitation, not the output.

This leads to the point that method 3 only works as written if you want to process EVERY file in a directory.

recompiling and the find command are the only real solutions. The find command can of course still be used with shell functions or xargs to do more than one thing, but ANY solution involving ls will have big limitations.

indeed... one should NEVER

Anonymous's picture

Indeed... one should NEVER parse, pipe, grep, capture, read, or loop over the output of 'ls'. Contrary to popular belief, 'ls' is NOT designed to enumerate files or parse their statistics. Using 'ls' for this is dangerous (word splitting), and there's always a better way, e.g., globs: files=(*). And xargs is a broken tool if you do not use the -0 option. Use ''find -exec'' or ''for file in *'' instead if at all possible. Two xargs 'bugs' in one: xargs rm <<< "Don't cry.mp3".

using scp

Anonymous's picture

I wanted to scp the other way, but this may help someone out there. If you know how to get files from a remote machine via scp this way, please post.

write a small shell script called "reverse-scp":

#!/usr/bin/sh
# "reverse-scp" copies with the destination first, unlike scp
dest=$1
shift
scp "$@" "$dest"

Then you can do:

find . -name '*' | xargs -l100 reverse-scp dest

for is your foriend

Anonymous's picture

for i in *; do mv ${i} dir2; done

Doing some timings between fi

Anonymous's picture

Doing some timings between find, xargs and a for loop, with 7,400 files ("Standard" P3-700 PC):

$ time ls | xargs -i{} mv {} ..
real 0m38.591s
user 0m6.520s
sys 0m30.260s

$ time find . -type f -exec mv {} .. \;
real 0m38.969s
user 0m6.020s
sys 0m31.310s

$ time for i in *; do mv $i ..; done
real 0m49.424s
user 0m8.270s
sys 0m40.260s

Find must do some clumping like xargs would do, straight mv on each file takes a bit longer.

You can tweak how many arguments xargs passes to the command, I got better times with around 1000 files per mv.

$ time ls | xargs -n1000 -i{} mv {} ..
real 0m37.783s
user 0m6.440s
sys 0m29.780s

--But this is all within a second or two (between xargs and find). So either would be better than a shell for loop.

Re: Method #3: Create a function

Anonymous's picture

[www.alexx.net/~roges/] I'm sure there are better ways but I got round this with a perl script.

#!/usr/bin/perl
# crashoverride ver 1.0 20041001 roges ad alexx dot net [non sic]
use strict;
my $file_ending;
my $cmd;
my $count;

sub footer
{
my $this_year=`date +%Y`; chop($this_year);
print "Copyright 2003-$this_year Cahiravahilla publications
";
}

sub help
{
print "Usage: crashoverride $bash_command $files_regexp [file_ending]
This will get round the "Argument list too long" problem";
&footer;
exit(1);
}

if( ($ARGV[0] =~ /^-+h/i) || (!$ARGV[0]) )
{
&help;
}
else
{
$cmd = $ARGV[0];
}

if($ARGV[2])
{
$file_ending = $ARGV[2];
}
else
{
$file_ending = '.*';
}

my $pwd = `pwd`; chomp($pwd);
my $files = $ARGV[1];
my @dir_list = `ls -la`;
my $safe;

foreach my $file (@dir_list) {
if($file =~ /$file_ending$/i)
{
$file =~ s/.*\s(\S*\.$file_ending)$/$1/i;
if($file !~ /^$files$/) { next; } #don't want the others
if($cmd=~m/^mv/ || $cmd=~m/^cp/)
{
$safe = '-i';
}
`$cmd $safe $pwd/$file`;
$count++;
}
}
print "
$count files $cmd"."-ed
";
exit(0);

Re:

Anonymous's picture

ls -1 > xx

for i in `cat xx`; do gzip $i; done

Re:

Anonymous's picture

ls -1 | awk '{print "mv " $1 " dest-dir"}' | sh

Summary

Anonymous's picture

In other words...

Provided we use GNU bash, GNU find and GNU xargs, and we want to move stuff from one directory to another, we can do it like this:

# big_mv [-m <glob_mask>] <source_dir> <dest_dir>
big_mv() {
unset big_m; [ "$1" = -m -a -n "$2" ] && { big_m=$2; shift 2; }
[ -d "$1" -a -d "$2" ] || { echo need two directories >&2; exit 1; }
( cd -- "$1" &&
find . -mindepth 1 -maxdepth 1 ${big_m+-name "$big_m"} -print0 |
xargs -0 mv --target-directory="$2" )
}

Comments: 'cd -- && find .' instead of 'find "$1"' enables you to have directory names starting with special characters. Putting cd in a ( subshell ) makes it not affect the current environment. '-mindepth 1' removes the '.' from the results. $1 and $2 are quoted properly everywhere to handle filenames with whitespace.

//traal

No need for shell functions

Anonymous's picture

Actually method #3 can be made shorter:

ls directory/ | while read f; do mv $f target/; done

I would even argue that it is simpler than #2, especially if you do not want find to recurse into subdirectories.
But watch out for file names with spaces/newlines/other nastiness; find -print0 ... | xargs -0 mv --target-dir ... would be a more robust and efficient solution.

Re: xargs

Anonymous's picture

It is not technically correct to say that the mv command is "only loaded once" when run under xargs.

xargs is exec'd once; it builds command lines (including the mv command, in these examples) as long as the local system allows (that's xargs' job: translate lines of input into arguments according to local system limits) and executes each of those lines. Thus it might be able to fit 500 filenames on one line (with the mv command, any mv option switches, and the target directory), exec that, and fit 200 on the next one (some longer filenames, perhaps) or 1000 on the next, etc.

The examples involving -i were misguided in that this (while it enables the replacement of {} with an argument) implies the "-l 1" ("dash ell, one") option, meaning that the mv command IS exec'd for each of the input lines/filenames.

The one that works efficiently is the one using the obscure GNU mv --target-dir= switch, which allows us to construct a mv command with all of the "object" arguments on the tail and the required "target" (normally the last arg) in "the head" (part of the non-replaced text).

BTW, it is often useful to use the GNU xargs -0 option with the GNU find -print0 (those are both "zeros") in order to account for degenerate filenames (with embedded spaces, tabs or newlines). This will cause find to produce ASCIIZ (NUL-terminated) filenames, and cause xargs to parse its input as such (NUL term.), which will resolve most of the parsing problems for the exec'd command (because a shell is not invoked to parse these command lines; xargs is exec()'ing them directly).

Thus the correct incantation is:

find . -print0 | xargs -0 mv --target-dir=$TARG

[BTW: the --target-dir= switch is also available in GNU cp, and the -0 (or similar) option is available in GNU cpio, GNU tar (--null) and a few others.]

... Thanks for playing!

(Jim Dennis, Linux Gazette Answer Guy and Guerilla Shell Scripting Master)

Why not use the xargs command?

SysKoll's picture

An easy fix would be the xargs command. It accepts a -n MaxArg flag, which specifies the maximum number of arguments to pass to each invocation. Your ls -1 example becomes:


ls -1 | xargs -n 10 -i mv {} target_dir

I found that xargs is generally faster than the find command.

-- SysKoll

Re: Why not use the xargs command?

Anonymous's picture

The -1 on ls is unnecessary; any ls whose output goes to anything other than a terminal is on a 1-file-per-line basis anyway.

And then what if someone deci

Anonymous's picture

And then what if someone decides to run the script interactively? Come on - it's not hurting anything performance wise to just be paranoid and include the '-1'.

Re: Method #2 (find) and other methods

Anonymous's picture

Method #2: Use the find command.

Example 2

[user@localhost directory]$ find $directory -type f -name '*' -exec mv {} $directory2/. \;

....
The main drawback to this method is the length of time required to complete the process. Unlike Method #1, where groups of files get processed as a unit, this procedure actually inspects the individual properties of each file before performing the designated operation. The overhead involved can be quite significant, and moving lots of files individually may take a long time.

Actually, this command can be improved thus:

[user@localhost directory]$ find $directory -type f | xargs -i mv {} $directory2/.

The first example requires reloading and rerunning mv for each and every file; with xargs, the mv command binary is only loaded once into memory.
The first example also had a useless
-name '*' specification in it.

The find method also has the drawback of descending into directories if all you want is files from the first level.

However, with this example, using $directory/* may be too long, but using $directory/ is not since it is not expanded by the shell. Also, since all files are being moved (assuming only files are in the directory), this example lends itself to moving the entire directory, either with tar/cpio and gzip, or perhaps just with mv:
[user@localhost directory]$ mv $directory $directory2

Also, GNU has a little known option, so you could do this (probably best):

[user@localhost directory]$ find $directory -type f | xargs mv --target-directory=$directory2

This would probably reduce the time substantially, since mv is only loaded into memory once, and is run few times compared to running it for every single file. It does, however, still descend into the $directory subdirectory tree looking for files to copy into $directory2.

Re: Method #2 (find) and other methods

charlesnadeau's picture

I tried:

find /backup/hourly.4 -exec mv {} /backup/hourly.2/.;

and got this error message back:

find: missing argument to `-exec'

Why? I didn't include "-type f" because I want to move the entire directory structure.
Thanks!

Charles
http://radio.weblogs.com/0111823/

Re: Method #2 (find) and other methods

Anonymous's picture

Try

find /backup/hourly.4 -exec mv {} /backup/hourly.2/. \;

The semi-colon needs to be escaped so that the shell does not grab it.

Andy

Re: Method #2 (find) and other methods

Anonymous's picture

You need to escape the semicolon at the end of the command:

find /backup/hourly.4 -exec mv {} /backup/hourly.2/. \;

Re: Method #2 (find) and other methods

Anonymous's picture

Use find $directory -type f -maxdepth 1 to avoid descending into subdirectories.

Re: Method #2 (find) and other methods

Anonymous's picture

Try doing the find like this to get rid of the annoying directory traversal problem...

[user@localhost directory]$ find "$directory" -level 0 -exec mv {} "$directory2"/. ';'

I removed the '-type f' as it gives different results than the original mv * did. The quotes around $directory and $directory2 are only needed if there are spaces or something in the directory names, but I recommend it in any scripts that could have arbitrary names passed in.

Of course, the -level 0 thing would work on any of the xargs variants of the find, as well...
