Page 1 of 1

itoa() not returning correct strings (RISC-V)

Posted: Sat Jan 22, 2022 7:44 am
by kspatlas
I'm having a weird bug with my itoa() implementation, where it seems to output random data.

Normal:

Code: Select all

decimal: 69
hex: deadbeef
actual:

Code: Select all

decimal:
hex: 7?6?>3115
I have tried many itoa() implementations, but all suffer from a similar issue. Here is the implementation I'm using right now:

Code: Select all

#define _abs(x) ((x < 0) ? (-x) : (x))
void _reverse(char* s) {
        int c, i, j;

        for (i = 0, j = sizeof(s)-1; i < j; i++, j--)
                c = s[i], s[i] = s[j], s[j] = c;
}

void itoa(int n, int base, char* s) {
        int i, sign;
        sign = n;
        i = 0;
        do
                s[i++] = _abs(n%base) + '0';
        while ((n/=10) != 0);

        if(sign < 0 && base == 10)
                s[i++] = '-';
        s[i] = '\0';

        _reverse(s);
}
This is just after booting from SBI, no paging or anything.

Re: itoa() not returning correct strings (RISC-V)

Posted: Sat Jan 22, 2022 11:39 am
by kzinti

Code: Select all

sizeof(char*) == 4

Re: itoa() not returning correct strings (RISC-V)

Posted: Sat Jan 22, 2022 2:08 pm
by nullplan
You know, your version of itoa is really bad. There is no limit on the array "s" given to the function, so no way to limit the number of writes into the array. Beyond what kzinti pointed out, writing to memory twice is just fundamentally bad. And you are giving a variable base to the function, but then end up not using it in two places. Using abs() in the loop is also unnecessary, you can factor that out.

Indeed writing a version that works under all circumstances is an interesting challenge. You can't necessarily turn a negative number positive, but you can always do it the other way around. That does mean working with negative numbers. The C standard allows two ways integer division can work with negative numbers (essentially, rounding the quotient towards zero or towards negative infinity), but in practice I have only ever seen the former. So I would likely do it like this:

Code: Select all

#if -5 % 3 != -2
#error "Your compiler is not using symmetric modulo. Adjust this code."
#endif
#include <stddef.h>
#include <assert.h>
int itoa(int a, int base, char *s, size_t len)
{
  int sgn = 0;
  if (a >= 0) a = -a;
  else sgn = 1;
  assert(base - 2u < 34u);

  size_t num_dig = 0;
  int x = a;
  do {
    num_dig++;
    x /= base;
  } while (x);
  if (sgn) num_dig++;
  if (num_dig >= len) return -1;
  s[num_dig--] = 0;
  do {
    s[num_dig] = '0' - a % base;
    if (s[num_dig] > '9') s[num_dig] += 'a' - '9' + 1;
    num_dig--;
    a /= base;
  } while (a);
  if (sgn) *s = '-';
  return 0;
}
Yes, the code assumes an ASCII machine. It also assumes symmetric modulo. Adjusting for mathematical modulo is not easy. In that case, repeated division would raise the number to -1 rather than 0, and the modulo would return the exact opposite digit, except if it returns zero. Thankfully that case is rare.

Re: itoa() not returning correct strings (RISC-V)

Posted: Sat Feb 05, 2022 5:45 am
by kspatlas
nullplan wrote:You know, your version of itoa is really bad. There is no limit on the array "s" given to the function, so no way to limit the number of writes into the array. Beyond what kzinti pointed out, writing to memory twice is just fundamentally bad. And you are giving a variable base to the function, but then end up not using it in two places. Using abs() in the loop is also unnecessary, you can factor that out.

Indeed writing a version that works under all circumstances is an interesting challenge. You can't necessarily turn a negative number positive, but you can always do it the other way around. That does mean working with negative numbers. The C standard allows two ways integer division can work with negative numbers (essentially, rounding the quotient towards zero or towards negative infinity), but in practice I have only ever seen the former. So I would likely do it like this:

Code: Select all

#if -5 % 3 != -2
#error "Your compiler is not using symmetric modulo. Adjust this code."
#endif
#include <stddef.h>
#include <assert.h>
int itoa(int a, int base, char *s, size_t len)
{
  int sgn = 0;
  if (a >= 0) a = -a;
  else sgn = 1;
  assert(base - 2u < 34u);

  size_t num_dig = 0;
  int x = a;
  do {
    num_dig++;
    x /= base;
  } while (x);
  if (sgn) num_dig++;
  if (num_dig >= len) return -1;
  s[num_dig--] = 0;
  do {
    s[num_dig] = '0' - a % base;
    if (s[num_dig] > '9') s[num_dig] += 'a' - '9' + 1;
    num_dig--;
    a /= base;
  } while (a);
  if (sgn) *s = '-';
  return 0;
}
Yes, the code assumes an ASCII machine. It also assumes symmetric modulo. Adjusting for mathematical modulo is not easy. In that case, repeated division would raise the number to -1 rather than 0, and the modulo would return the exact opposite digit, except if it returns zero. Thankfully that case is rare.
So i tried that, and it works, but is there any way to disable negative numbers for bases other than 10?

Re: itoa() not returning correct strings (RISC-V)

Posted: Sun Feb 06, 2022 9:42 pm
by nullplan
kspatlas wrote:So i tried that, and it works, but is there any way to disable negative numbers for bases other than 10?
Not really. I suggest making a second function to print unsigned numbers. E.g.

Code: Select all

#include <stddef.h>
#include <assert.h>
int utoa(unsigned a, int base, char *s, size_t len)
{
  assert(base - 2u < 34u);

  size_t num_dig = 0;
  unsigned x = a;
  do {
    num_dig++;
    x /= base;
  } while (x);
  if (num_dig >= len) return -1;
  s[num_dig--] = 0;
  do {
    s[num_dig] = '0' + a % base;
    if (s[num_dig] > '9') s[num_dig] += 'a' - '9' + 1;
    num_dig--;
    a /= base;
  } while (a);
  return 0;
}
Which, for the most part, is the same function as above, but with some types changes and some code removed. And the minus in the line that generates the digits changed into a plus.

I don't think you will be able to do much better. Although, if you really only want to use bases 10 and 16, you might be better off just creating functions with those specific bases hardcoded. That also makes it way easier for the powers of two, because then you only need to shift and mask rather than divide and modulo.