Playing with ptrace, Part I

Using ptrace allows you to set up system call interception and modification at the user level.

Reading System Call Parameters

By calling ptrace with PTRACE_PEEKUSER as the first argument, we can examine the contents of the USER area, where register contents and other information are stored. The kernel stores the contents of registers in this area for the parent process to examine through ptrace.

Let's show this with an example:

#include <stdio.h>         /* For printf */
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/user.h>    /* ORIG_EAX, EBX etc; in <sys/reg.h> on newer glibc */
#include <sys/syscall.h>   /* For SYS_write etc */
int main()
{   pid_t child;
    long orig_eax, eax;
    long params[3];
    int status;
    int insyscall = 0;
    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
       while(1) {
          wait(&status);
          if(WIFEXITED(status))
              break;
          orig_eax = ptrace(PTRACE_PEEKUSER,
                     child, 4 * ORIG_EAX, NULL);
          if(orig_eax == SYS_write) {
             if(insyscall == 0) {
                /* Syscall entry */
                insyscall = 1;
                params[0] = ptrace(PTRACE_PEEKUSER,
                                   child, 4 * EBX,
                                   NULL);
                params[1] = ptrace(PTRACE_PEEKUSER,
                                   child, 4 * ECX,
                                   NULL);
                params[2] = ptrace(PTRACE_PEEKUSER,
                                   child, 4 * EDX,
                                   NULL);
                printf("Write called with "
                       "%ld, %ld, %ld\n",
                       params[0], params[1],
                       params[2]);
             }
             else { /* Syscall exit */
                eax = ptrace(PTRACE_PEEKUSER,
                             child, 4 * EAX, NULL);
                printf("Write returned "
                       "with %ld\n", eax);
                insyscall = 0;
             }
          }
          ptrace(PTRACE_SYSCALL,
                 child, NULL, NULL);
       }
    }
    return 0;
}

This program should print an output similar to the following:

ppadala@linux:~/ptrace > ls
a.out        dummy.s      ptrace.txt
libgpm.html  registers.c  syscallparams.c
dummy        ptrace.html  simple.c
ppadala@linux:~/ptrace > ./a.out
Write called with 1, 1075154944, 48
a.out        dummy.s      ptrace.txt
Write returned with 48
Write called with 1, 1075154944, 59
libgpm.html  registers.c  syscallparams.c
Write returned with 59
Write called with 1, 1075154944, 30
dummy        ptrace.html  simple.c
Write returned with 30

Here we are tracing the write system calls, and ls makes three of them. The call to ptrace with a first argument of PTRACE_SYSCALL makes the kernel stop the child process at every system call entry and exit. It's equivalent to doing a PTRACE_CONT and stopping at the next system call entry or exit.

In the previous example, we used PTRACE_PEEKUSER to look into the arguments of the write system call. When a system call returns, the return value is placed in %eax, and it can be read as shown in that example.

The status variable in the wait call is used to check whether the child has exited. This is the typical way to check whether the child has been stopped by ptrace or was able to exit. For more details on macros like WIFEXITED, see the wait(2) man page.
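
As a rough sketch of that check, the fragment below sorts a wait status into the cases described in wait(2). The helper names classify_status and status_of_exiting_child are made up for illustration and are not part of the article's code; the second one exists only so the first can be tried without a tracer.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: name the kind of event a wait status reports. */
const char *classify_status(int status)
{
    if (WIFEXITED(status))
        return "exited";      /* normal termination; see WEXITSTATUS */
    if (WIFSIGNALED(status))
        return "killed";      /* terminated by a signal; see WTERMSIG */
    if (WIFSTOPPED(status))
        return "stopped";     /* for example a ptrace stop; see WSTOPSIG */
    return "other";
}

/* Hypothetical helper: fork a child that exits with `code` and
 * return the raw status the parent collects, or -1 on failure. */
int status_of_exiting_child(int code)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);          /* child: terminate immediately */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return status;
}
```

In the tracing loop above, only the "exited" case ends the loop; a "stopped" status is what the parent sees at each system call entry and exit.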

Reading Register Values

If you want to read register values at the time of a syscall entry or exit, the procedure shown above can be cumbersome. Calling ptrace with a first argument of PTRACE_GETREGS fetches all the registers in a single call.

The code to fetch register values looks like this:

#include <stdio.h>         /* For printf */
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/user.h>    /* user_regs_struct; in <sys/user.h> on newer glibc */
#include <sys/syscall.h>
int main()
{   pid_t child;
    long orig_eax, eax;
    long params[3];
    int status;
    int insyscall = 0;
    struct user_regs_struct regs;
    child = fork();
    if(child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl("/bin/ls", "ls", NULL);
    }
    else {
       while(1) {
          wait(&status);
          if(WIFEXITED(status))
              break;
          orig_eax = ptrace(PTRACE_PEEKUSER,
                            child, 4 * ORIG_EAX,
                            NULL);
          if(orig_eax == SYS_write) {
              if(insyscall == 0) {
                 /* Syscall entry */
                 insyscall = 1;
                 ptrace(PTRACE_GETREGS, child,
                        NULL, &regs);
                 printf("Write called with "
                        "%ld, %ld, %ld\n",
                        regs.ebx, regs.ecx,
                        regs.edx);
             }
             else { /* Syscall exit */
                 eax = ptrace(PTRACE_PEEKUSER,
                              child, 4 * EAX,
                              NULL);
                 printf("Write returned "
                        "with %ld\n", eax);
                 insyscall = 0;
             }
          }
          ptrace(PTRACE_SYSCALL, child,
                 NULL, NULL);
       }
   }
   return 0;
}

This code is similar to the previous example except for the call to ptrace with PTRACE_GETREGS. Here we have made use of the user_regs_struct defined in <linux/user.h> to read the register values.

______________________

Comments

Ptrace for multi-thread application

Solon Chen

After struggling for a long time, I found a way to make ptrace work correctly with a multi-threaded application. Here is my sample code; I hope it helps others who have the same confusion.


char trapCode[] = {0, 0, 0, 0};
int status;

ptrace(PTRACE_ATTACH, childProcess, NULL, NULL); // childProcess is the main thread
wait(NULL);

printf("\nchild %d created\n", childProcess);
fflush(stdout);

// Ask the kernel to report clone() so that new threads are traced as well
long ptraceOption = PTRACE_O_TRACECLONE;
ptrace(PTRACE_SETOPTIONS, childProcess, NULL, ptraceOption);

struct user_regs_struct regs;

// Plant an int3 (0xcc) at every enabled breakpoint, saving the original byte
for(unsigned int i = 0; i < m_breakPoints.size(); i++)
{
    BreakPoint_Info breakPointInfo = m_breakPoints[i];
    if(!breakPointInfo.m_enabled)
        continue;

    unsigned int index = breakPointInfo.m_checkPointIndex;
    if(m_bytesBackup.find(m_checkPoints[index].m_offset) != m_bytesBackup.end())
        continue;

    unsigned long readAddr = m_checkPoints[index].m_offset;
    One_Byte_With_Result *oneByte = new One_Byte_With_Result;
    getData(childProcess, readAddr, trapCode, 4);
    oneByte->m_char = trapCode[0];
    trapCode[0] = 0xcc;
    putData(childProcess, readAddr, trapCode, 4);

    m_bytesBackup.insert(std::make_pair(m_checkPoints[index].m_offset, oneByte));
}

std::set<pid_t> allThreads;
std::set<pid_t>::iterator allThreadsIter;
allThreads.insert(childProcess);

int rec = ptrace(PTRACE_CONT, childProcess, NULL, NULL);

while(true)
{
    // __WALL makes waitpid report threads created with clone() too
    pid_t child_waited = waitpid(-1, &status, __WALL);

    if(child_waited == -1)
        break;

    if(allThreads.find(child_waited) == allThreads.end())
    {
        printf("\nreceived unknown child %d\t", child_waited);
        break;
    }

    if(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
    {
        pid_t new_child;
        // A clone event carries PTRACE_EVENT_CLONE in the upper bits of status
        if(((status >> 16) & 0xffff) == PTRACE_EVENT_CLONE)
        {
            if(ptrace(PTRACE_GETEVENTMSG, child_waited, 0, &new_child) != -1)
            {
                allThreads.insert(new_child);
                ptrace(PTRACE_CONT, new_child, 0, 0);

                printf("\nchild %d created\t", new_child);
            }
            ptrace(PTRACE_CONT, child_waited, 0, 0);
            continue;
        }
    }

    if(WIFEXITED(status))
    {
        allThreads.erase(child_waited);
        printf("\nchild %d exited with status %d\t", child_waited, WEXITSTATUS(status));

        if(allThreads.size() == 0)
            break;
    }
    else if(WIFSIGNALED(status))
    {
        allThreads.erase(child_waited);
        printf("\nchild %d killed by signal %d\t", child_waited, WTERMSIG(status));

        if(allThreads.size() == 0)
            break;
    }
    else if(WIFSTOPPED(status))
    {
        int stopCode = WSTOPSIG(status);
        if(stopCode == SIGTRAP)
        {
            ptrace(PTRACE_GETREGS, child_waited, NULL, &regs);
            unsigned long currentEip = regs.eip;
            //printf("%d\t%08x\n", child_waited, currentEip);

            // eip points just past the int3, so the breakpoint is at eip - 1
            Address_Bytes_Map::iterator iter = m_bytesBackup.find(currentEip - 1);
            if(iter != m_bytesBackup.end())
            {
                iter->second->m_result = true;

                // Restore the original byte and rewind eip to re-execute it
                regs.eip = regs.eip - 1;
                getData(child_waited, regs.eip, trapCode, 4);
                trapCode[0] = iter->second->m_char;
                putData(child_waited, regs.eip, trapCode, 4);
                rec = ptrace(PTRACE_SETREGS, child_waited, NULL, &regs);
            }
        }
    }

    rec = ptrace(PTRACE_CONT, child_waited, 1, NULL);

    continue;
}
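
The clone check in the loop above relies on how Linux packs a ptrace event stop into the wait status: the stop signal is SIGTRAP and the event code sits in the bits above the low 16. The fragment below is only a sketch of that decoding in isolation; the helper name ptrace_event_of is hypothetical, and it assumes the PTRACE_EVENT_* constants from <sys/ptrace.h>.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical helper: return the PTRACE_EVENT_* code carried by a
 * wait status, or 0 if the status is not a SIGTRAP stop with an event. */
int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    return (status >> 16) & 0xffff;   /* event code lives above bit 15 */
}
```

A tracer loop could then write `if (ptrace_event_of(status) == PTRACE_EVENT_CLONE)` before asking for the new thread ID with PTRACE_GETEVENTMSG.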

Breakpoint at line number

eager

Karan Verma asked how to put a breakpoint at a particular line number. This is exactly what a debugger does.

There are many nuances and details, but I'll simplify this to the bare minimum:

1) Find where the program is loaded in memory.
2) Read the executable and locate the debug data which corresponds to the source file containing the line at which you want to place the breakpoint.
3) Interpret the line number table in the debug data to locate the address which corresponds to the desired line.
4) Adjust the address from (3) to make it correspond with (1) if necessary.
5) Copy and save the original instruction at the breakpoint address.
6) Write the breakpoint instruction at the breakpoint address.

Naturally, this is usually done when the child process is stopped. After inserting the breakpoint, the child is allowed to run.

When the breakpoint is executed, the child process will stop and the parent will receive a signal. The parent process needs to replace the breakpoint instruction with the original instruction before it can allow the child process to continue.

GDB and LLDB are open source debuggers which give examples of how this is done. Reading GDB is not for the faint of heart -- there's a lot of complexity in handling many different object file formats and many different target architectures.

Michael Eager, Consultant, Embedded Systems, Compilers, Debuggers

Walking the call stack

eager

Sandeep mentioned wanting to use ptrace() to walk the call stack.

Ptrace() provides access to the memory and registers of a child process. It doesn't tell you how that memory is organized.

Information on how the stack is organized is usually contained in the ABI (Application Binary Interface) for each processor. The Call Frame Information (CFI) in DWARF debugging information (see dwarfstd.org) describes where each call has saved registers.

If you want to write a routine which walks the call stack, I suggest that you start with one which will walk the stack in the current process and later convert it to accessing a child process. The first step is to find the start of the current stack frame, then find the previous frame.

Michael Eager, Consultant, Embedded Systems, Compilers, Debuggers

Answer: Where is ORIG_EAX?

eager

ORIG_EAX is in . Replace with this path. (Oh, and add #include to avoid compiler complaints about printf definition conflicts.)

Michael Eager, Consultant, Embedded Systems, Compilers, Debuggers

ptrace multi thread application

Solon Chen

This article helped me a lot. I'm trying to create a line coverage tool using ptrace.

One problem is that ptrace only handles a single thread, and I don't know how to deal with a multi-threaded application.

I set the option to catch clone events, which helps me find the PIDs of all the LWPs.
I also tried to make all threads continue, but it seems only the child thread goes on until it sleeps. The continue command has no effect on the parent thread; /proc//status shows it in "tracing stop".

Could you tell me how to make all threads continue? (I have restored the breakpoint in runtime memory.)

Thanks.

Pradeep, this example sucks!

Anonymous

This example is useless. The author tries to impress with complicated function calls which are absolutely useless.
Joe

Sounds to me like your

Anonymous

Sounds to me like you're frustrated with your incompetence

Compilation Errors for some constants

qtp

hi,
i am getting Compilation Errors for a few constants, like "ORIG_EAX".

it is probably in User.h. but when i search for user.h, i don't find it declared there.

can i define these variables to some value from my code? which values should i assign and what is the significance of those?

thanks.

Excellent article

Anonymous

Thanks Pradeep, great intro to ptrace().

I've used strace() for debugging for a couple of years and never knew this was what it used; I'm hoping to create a simulator using ptrace() soon for automatic integration testing of an embedded project and your article is a great start to see some real code :-)

--rob

C++ stack trace on linux

Anonymous

Hi,

i am working on an application which can dump call stack of C++ for me.
i am trying to use ptrace() for the same but not getting proper direction or the steps i must follow for the same.

could someone guide me on the same?

Thanks.
sandeep

Error in putdata()

Aliphany

Please be aware that putdata() contains a serious mistake. If len is not a multiple of four, then putdata() should read the final long value, replace one, two, or three bytes, and then write the value. This mistake causes the second example in Part II to seg fault.
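
A read-modify-write of the final word, as described above, could be sketched like this. This is not the putdata() from Part II; put_bytes and merge_tail are hypothetical names, and the word size is taken as sizeof(long) rather than a fixed four so the same idea carries over to 64-bit builds.

```c
#include <errno.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Overwrite the first n bytes of a word (n < sizeof(long)) with src,
 * leaving the remaining bytes as they were. */
long merge_tail(long old_word, const char *src, size_t n)
{
    long word = old_word;
    memcpy(&word, src, n);
    return word;
}

/* Hypothetical replacement for putdata(): copy len bytes from src to
 * addr in the traced child. Returns 0 on success, -1 on error. */
int put_bytes(pid_t child, long addr, const char *src, size_t len)
{
    const size_t w = sizeof(long);
    size_t done = 0;
    long word;

    /* Whole words can be written directly. */
    for (; len - done >= w; done += w) {
        memcpy(&word, src + done, w);
        if (ptrace(PTRACE_POKEDATA, child,
                   (void *)(addr + done), (void *)word) == -1)
            return -1;
    }
    /* Partial tail: read the word, replace its leading bytes, write it back. */
    if (done < len) {
        errno = 0;                   /* -1 can be a valid word, so check errno */
        word = ptrace(PTRACE_PEEKDATA, child, (void *)(addr + done), NULL);
        if (word == -1 && errno != 0)
            return -1;
        word = merge_tail(word, src + done, len - done);
        if (ptrace(PTRACE_POKEDATA, child,
                   (void *)(addr + done), (void *)word) == -1)
            return -1;
    }
    return 0;
}
```

The extra PTRACE_PEEKDATA is what keeps the bytes beyond the end of the buffer intact in the child.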

Thanks

ChrisO

Great article, thanks!

One question though, why do you multiply the addresses of registers by 4 when reading them with PTRACE_PEEKUSER?

Indexes not addresses

Mitch Frazier

The register "addresses" you're referring to (EAX, EBX, etc) are indexes:

#define EBX 0
#define ECX 1
#define EDX 2
#define ESI 3
#define EDI 4
...

into the user data:

struct user_regs_struct
{
  long int ebx;
  long int ecx;
  long int edx;
  long int esi;
  long int edi;
  ...

So to get an address, multiply by 4 (the "word" size on an x86 system).
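
The relationship between index and offset can be checked with offsetof. The sketch below uses a stand-in struct, regs32, with int32_t fields to model the 4-byte long int of a 32-bit x86 build; the IDX_ names and reg_offset are hypothetical and only mirror the first five entries quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Register indexes as quoted above (first five only). */
enum { IDX_EBX, IDX_ECX, IDX_EDX, IDX_ESI, IDX_EDI };

/* Stand-in for the start of the 32-bit user_regs_struct. */
struct regs32 {
    int32_t ebx, ecx, edx, esi, edi;
};

/* Byte offset into the register area for a given register index. */
size_t reg_offset(int index)
{
    return (size_t)index * sizeof(int32_t);
}
```

With these definitions, reg_offset(IDX_EDX) is 8, which matches offsetof(struct regs32, edx) and corresponds to the 4 * EDX passed to PTRACE_PEEKUSER in the article.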

Mitch Frazier is an Associate Editor for Linux Journal.

ptrace and threads

Anonymous

I'm posting this hoping the next fellow who encounters this gotcha can save a little time...

ptrace only works in the base thread of the parent process. ptrace(PTRACE_CONT, pid) will fail with ESRCH (process not found) if issued in a child thread on Linux.

If you are thinking of using a debugger thread to watch each child thread, give it up. It won't work. And unless you find this message or have a sudden epiphany, you are liable to spend a great deal of time bashing your poor head against the wall.

Google, take it from here!

line numbers ptrace

Karan Verma

Hi

If we want to put a breakpoint at a particular line number, how would we accomplish that using ptrace?

Hi Pradeep, Thanks a

Patel

Hi Pradeep,

Thanks a lot for this article. This is the one that helped me write my own tool to capture all the system calls (including calls from forked/child processes too). Thanks again!!!

~Patel
