malloc | wtf

Some notes and small cases about the heap that you may find useful.

0x01 System call - mmap, brk

malloc first has to get memory from somewhere.

system calls - ask the kernel directly
- mmap()
  - ask the kernel to give us some new virtual addresses
- brk()
  - change the size of the data segment

The process doesn't care how the memory is implemented.

The Linux kernel, together with the CPU's MMU, maps the allocated memory into the process. The process can then access this memory transparently.

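In a CTF challenge the allocations are usually tiny. They are carved out of a heap region that malloc has already obtained, so you rarely see an extra brk or mmap call for each malloc.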

How does malloc work?

0x02 malloc implementation

For simplicity, we treat this large region as the heap.

Now we want 8 bytes of memory to hold some characters such as AAAABBBB, so we call malloc(8). It returns an address that we can write to.

+—————————+—————————+—————————+—————————+ <- 0x804000
| A A A A | B B B B |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804010
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804020
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804030
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804040

If we call malloc(8) again, what will happen?

A guess:

+—————————+—————————+—————————+—————————+ <- 0x804000
| A A A A | B B B B | A A A A | B B B B |
+—————————+—————————+—————————+—————————+ <- 0x804010
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804020
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804030
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804040

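If this were reality, it would raise some questions.

How does malloc know what address to return?
How does malloc know which areas are still free?

Here is one implementation.

DLmalloc

dlmalloc is the design behind the most commonly used malloc implementations.

workflow

For each chunk it hands out, malloc stores the size of the chunk right before the chunk's data.
The size is kept in the 4 bytes before the data (this is the layout before the chunk is freed).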

			   ret_addr: 0x804008
+—————————+—————————+—————————+—————————+ <- 0x804000
|         |size:0x10| A A A A | B B B B |
+—————————+—————————+—————————+—————————+ <- 0x804010
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804020
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804030
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804040

The address returned by the first malloc can be calculated as 0x804000 + 0x8 = 0x804008.

Then, after that call, how does malloc know where the next chunk can be placed?

If the free/available area starts at 0x804000, the next chunk starts at start + current chunk size, that is 0x804000 + 0x10 (size) = 0x804010. malloc does this bookkeeping itself, on memory it obtained through the system calls.

So somewhere there is a pointer that always points to free memory.

Given the address returned by malloc, what can we do with it?

Writable data addr: the pointer itself points to the start of the area we can write data to.
Size of chunk: stored right before the writable data addr.
Next chunk addr: the start of the current chunk plus the size of this chunk.

0x03 small code cases

heap1.c

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#include <time.h>
#include <sys/types.h>



struct internet {
    int     priority;
    char    *name;
};

void winner()
{
    printf("and we have a winner @ %d\n", time(NULL));
}

int main(int argc, char **argv)
{
    struct internet *i1, *i2, *i3;

    i1 = malloc(sizeof(struct internet));
    i1->priority = 1;
    i1->name = malloc(8);

    i2 = malloc(sizeof(struct internet));
    i2->priority = 1;
    i2->name = malloc(8);

    strcpy(i1->name, argv[1]);
    strcpy(i2->name, argv[2]);

    printf("and that's a wrap folks\n");
    return 0;
}

Objective: get winner() called.
struct internet
- first member: priority, (type): int
- second member: name, (type): char pointer (it contains a pointer to a string stored somewhere else)

A pointer (*) is a variable that contains an address.

On a 32-bit machine a pointer (an address) is 4 bytes; on a 64-bit machine it is 8 bytes.

malloc(sizeof(struct internet)): the requested size is 4 (int) + 4 (char *) = 8 bytes (tested on 32-bit).

i1->priority = 1; writes 1 to the first 4 bytes of the allocated area.

				 int priority  char *name
+—————————+—————————+—————————+—————————+ <- 0x804d198(i1=0x804d198+8)
|         |size:0x10|   0x1   |         |
+—————————+—————————+—————————+—————————+ <- 0x804d1a8
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804d1b8
|         |         |         |         |
+—————————+—————————+—————————+—————————+ <- 0x804d1c8

The name pointer sits at offset 4 of the i1 object (at i1 + 4).

i1->name = malloc(8); allocates another 8 bytes and stores the resulting address in the char pointer name. Those 8 bytes are intended to hold a string of characters.

As programmers we write i1->name to access the name member. The machine simply goes to a fixed offset from i1; here i1 + 4 is the location of the char pointer name, so i1->name is [(0x804d198+8)+4]. The notation [num] means the value stored at address num, that is, an indirect fetch.

pwndbg> heap
Allocated chunk | PREV_INUSE
Addr: 0x804d008
Size: 0x191

Allocated chunk | PREV_INUSE
Addr: 0x804d198
Size: 0x11

Top chunk | PREV_INUSE
Addr: 0x804d1a8
Size: 0x21e59

pwndbg> x/12wx 0x804d198
0x804d198:      0x00000000      0x00000011      0x00000000      0x00000000
0x804d1a8:      0x00000000      0x00021e59      0x00000000      0x00000000

i2 looks like i1.

We run the program with argv[1] = "aaaabbbb" and argv[2] = "aaaabbbb"; the heap then looks like this.

pwndbg> r aaaabbbb aaaabbbb
pwndbg> ...
pwndbg> x/22wx 0x804d198
0x804d198:      0x00000000      0x00000011      0x00000001      0x0804d1b0 <- i1's priority, *name
0x804d1a8:      0x00000000      0x00000011      0x61616161      0x62626262 <- i1's name content
0x804d1b8:      0x00000000      0x00000011      0x00000001      0x0804d1d0 <- i2's priority, *name
0x804d1c8:      0x00000000      0x00000011      0x61616161      0x62626262 <- i2's name content
0x804d1d8:      0x00000000      0x00021e29      0x00000000      0x00000000

A dangerous function: strcpy

strcpy performs no length check, so writing more than 8 bytes overflows name and corrupts whatever follows it on the heap.

The lowest bit of the size field indicates that the PREVIOUS chunk is in use, which is why we see 0x11 and not 0x10. That becomes more important for free().

The allocator used here is not really the original dlmalloc. glibc's version is derived from it and is usually referred to as ptmalloc.

0x04 overwrite

A heap overflow by itself achieves little. What should we overwrite? The technique used here is called a GOT overwrite.

GOT (_GLOBAL_OFFSET_TABLE_): holds the run-time addresses of dynamically linked functions while the program runs.

A function call goes through the function's PLT address.

-> 0x08049312 <+192>:   call   0x80490e0 <puts@plt>
pwndbg> disassemble 0x80490e0
Dump of assembler code for function puts@plt:
   0x080490e0 <+0>:     endbr32 
   0x080490e4 <+4>:     jmp    DWORD PTR ds:0x804c01c  <-puts@got
   0x080490ea <+10>:    nop    WORD PTR [eax+eax*1+0x0]
End of assembler dump.

We need to pad with some characters to reach and overwrite i2's name pointer.

It looks like this.