The preauthentication remote hole in sshd's challenge-response code is a
heap-based overflow resulting from an integer overflow.

input_userauth_info_response() in auth2-chall.c:

258         nresp = packet_get_int();
259         if (nresp > 0) {
260                 response = xmalloc(nresp * sizeof(char*));
261                 for (i = 0; i < nresp; i++)
262                         response[i] = packet_get_string(NULL);
263         }

The variable 'nresp' is supplied by the ssh client and can be arbitrarily
large. On x86, 'nresp' is a 4-byte unsigned integer with a maximum value
(UINT_MAX) of 0xffffffff. Also on x86, the size of a pointer is 4 bytes. The
argument to xmalloc() overflows when 'nresp' exceeds 0xffffffff / 4
(0x3fffffff), that is, when it is at least 0x40000000. The function xmalloc()
is a traditional wrapper around malloc() and will henceforth be referred to
as malloc(). The allocation size can be forced easily through bogus values
of 'nresp'.

Example:

	0x40000000 + 1
		= 0x40000001 
	0x40000001 * 4
		= 0x100000004
	0x100000004 is obviously too large to fit in a 32-bit value, so
	truncation reduces it to 0x00000004. In this way, 4 bytes are allocated
	through the xmalloc() wrapper.
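
The arithmetic above can be checked with a few lines of C. This is only a
sketch of the 32-bit case described in the text: alloc_size_32() is a
hypothetical helper (it does not exist in sshd) that models a 4-byte size
argument by reducing the product modulo 2^32.

```c
#include <stdint.h>

/* Hypothetical helper, not part of sshd: the byte count a 32-bit size
 * argument would receive for 'nresp' elements of 'elem_size' bytes.
 * The product is computed in 64 bits, then truncated to 32 bits. */
uint32_t alloc_size_32(uint32_t nresp, uint32_t elem_size)
{
        uint64_t full = (uint64_t)nresp * elem_size;

        return (uint32_t)full;
}
```

With 4-byte pointers, alloc_size_32(0x40000001, 4) yields 4, matching the
example, while alloc_size_32(0x3fffffff, 4) is the largest product that still
fits (0xfffffffc).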

The for() loop, however, still uses the original value of 'nresp', which is
far larger than the number of pointers that fit in the allocation. Once the
loop index passes the end of the buffer, every iteration stores a pointer
beyond the malloc()'d memory, overflowing the heap.
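
The mismatch between the loop count and the allocation exists only because
the multiplication is never checked. As a minimal sketch of the missing test
(an assumption for illustration, not the change made in any OpenSSH release),
a count can be rejected before multiplying; checked_ptr_array_alloc() is a
hypothetical name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for 'n' pointers, refusing any
 * 'n' whose byte count would not fit in size_t. Returns NULL when the
 * count is rejected or when malloc() itself fails. */
char **checked_ptr_array_alloc(size_t n)
{
        if (n == 0 || n > SIZE_MAX / sizeof(char *))
                return NULL;
        return malloc(n * sizeof(char *));
}
```

The comparison is done with a division, so the check itself cannot wrap.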

This is not of that class of malloc()-based bugs where execution hijacking
can result from careful manipulation of malloc() administration chunks.
There are two important details to bear in mind:

1. OpenBSD and FreeBSD use the dynamic memory allocation routines written by
   Poul-Henning Kamp.

2. 'response[i]' receives pointers returned from other calls to malloc().
   Therefore, the overwriting values can't practically be controlled. If
   the buffer were on the stack, we could hope to have one of the pointers
   overwrite a saved instruction pointer, thus providing offset-independent
   redirection (for your hacking knowledge base, many programs are
   vulnerable to stack-based pointer array overflows of this kind). We are
   still afforded offset-independence here, but we have to search harder
   for something to overwrite... a function pointer perhaps... but what are
   the chances? We'll put aside that problem for now.


Back on track...

261                 for (i = 0; i < nresp; i++)
262                         response[i] = packet_get_string(NULL);

Given that 'nresp' should be at least 0x40000001 (1073741825), our first task
is to break out of this loop somehow; this saves us an incredible amount of
data transfer. It's observed that sshd has no SIGURG handler.

We're forced to examine packet_get_string() in detail:

1131 void *
1132 packet_get_string(u_int *length_ptr)
1133 {
1134         return buffer_get_string(&incoming_packet, length_ptr);
1135 }


203 void *
204 buffer_get_string(Buffer *buffer, u_int *length_ptr)
205 {
206         u_int len;
207         u_char *value;
208         /* Get the length. */
209         len = buffer_get_int(buffer);
210         if (len > 256 * 1024)
211                 fatal("buffer_get_string: bad string length %d", len);
212         /* Allocate space for the string.  Add one byte for a null character. */
213         value = xmalloc(len + 1);
214         /* Get the string. */
215         buffer_get(buffer, value, len);
216         /* Append a null character to make processing easier. */
217         value[len] = 0;
218         /* Optionally return the length of the string. */
219         if (length_ptr)
220                 *length_ptr = len;
221         return value;
222 }

After examining the possibilities, it seems there's only one viable means of
early loop exit: if the length field of the string in the packet specifies a
value greater than 256 kilobytes (we use 257 kilobytes in sshconnect2.c), a
call to fatal() is made. This doesn't sound too good, though. But let's
see...
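
The bound involved is the one tested on line 210 of buffer_get_string(). As a
small sketch, the comparison can be isolated as follows; the function and
macro names are hypothetical and only the 256 * 1024 limit comes from the
source above.

```c
#include <stdint.h>

/* Limit taken from the check in buffer_get_string(). */
#define MAX_STRING_LEN (256u * 1024u)

/* Hypothetical helper: returns 1 when a string length field would take
 * the fatal() branch, 0 when the string would be read normally. */
int string_length_is_fatal(uint32_t len)
{
        return len > MAX_STRING_LEN;
}
```

A length of exactly 256 kilobytes still passes; 257 kilobytes, the value
mentioned above, does not.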

32 void
33 fatal(const char *fmt,...)
34 {
35         va_list args;
36         va_start(args, fmt);
37         do_log(SYSLOG_LEVEL_FATAL, fmt, args);
38         va_end(args);
39         fatal_cleanup();
40 }

do_log() isn't interesting. Moving on to fatal_cleanup()...

172 /* Fatal cleanup */
173 
174 struct fatal_cleanup {
175         struct fatal_cleanup *next;
176         void (*proc) (void *);
177         void *context;
178 };
179 
180 static struct fatal_cleanup *fatal_cleanups = NULL;

[...]

216 void
217 fatal_cleanup(void)
218 {
219         struct fatal_cleanup *cu, *next_cu;
220         static int called = 0;
221 
222         if (called)
223                 exit(255);
224         called = 1;
225         /* Call cleanup functions. */
226         for (cu = fatal_cleanups; cu; cu = next_cu) {
227                 next_cu = cu->next;
228                 debug("Calling cleanup 0x%lx(0x%lx)",
229                     (u_long) cu->proc, (u_long) cu->context);
230                 (*cu->proc) (cu->context);
231         }
232         exit(255);
233 }


Lovely. A function pointer. Examining sshd source in further detail, it can
be seen that for each cleanup function, a 'fatal_cleanup' structure is
dynamically allocated on the heap. One such cleanup function,
packet_close(), is always registered. The function packet_close() is what
gets invoked on line 230 above. It isn't interesting.
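
To make the mechanism concrete, here is a toy model of a cleanup list. It is
a sketch only: the names are hypothetical, the structure layout is copied
from the listing above, and the registration order (new nodes pushed on the
front) is an assumption rather than something shown in the quoted source.

```c
#include <stdlib.h>

struct cleanup {
        struct cleanup *next;
        void (*proc)(void *);
        void *context;
};

static struct cleanup *cleanups = NULL;

/* Register a handler. Each node is allocated on the heap, so the
 * function pointer it holds lives in malloc()'d memory.
 * Returns 0 on success, -1 if the allocation fails. */
int add_cleanup(void (*proc)(void *), void *context)
{
        struct cleanup *cu = malloc(sizeof(*cu));

        if (cu == NULL)
                return -1;
        cu->proc = proc;
        cu->context = context;
        cu->next = cleanups;
        cleanups = cu;
        return 0;
}

/* Walk the list and call every handler through its stored pointer.
 * Returns the number of handlers called. Nodes are not freed, since
 * the real code exits right after the walk. */
int run_cleanups(void)
{
        struct cleanup *cu, *next_cu;
        int n = 0;

        for (cu = cleanups; cu; cu = next_cu) {
                next_cu = cu->next;
                (*cu->proc)(cu->context);
                n++;
        }
        return n;
}

/* Example handler: increments the int that 'context' points to. */
void count_call(void *context)
{
        ++*(int *)context;
}
```

Whatever address sits in 'proc' when the list is walked is the address that
gets called, which is why the heap location of these nodes matters.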

What we need to do is overwrite the function pointer in the 'fatal_cleanup'
structure that corresponds to packet_close(). To do this we need the
'response' buffer to be allocated at a lower memory address than that used
for the cleanup structure. This requirement is closely tied to the memory
profile of the process, but given the general way in which phk malloc
operates, we can assume that we'll need to reuse a free()'d memory chunk. To
do this, we have to be very particular about our 'nresp' value.

Unfortunately, the memory profile varies considerably depending on which
keyboard-interactive device is used -- skey or bsdauth. The ssh(1) patch
submitted to Bugtraq by another party reuses a 4096 byte chunk (1 page),
which seems to work fine in most cases when using bsdauth. However, results
are mixed when using skey. Nevertheless, we have devised an extremely simple
procedure to follow that should result in almost flawless exploitation of
vulnerable sshd daemons on at least OpenBSD. You can read all about it in
the HOWTO file.

