Version 0.1.1 - bug fixes and windows horror fonts

Jens
Odd thing, the more you test a build the more you find bugs.

I found a couple of bugs in BEdit so I decided to fix the really annoying ones and record the rest, up the version number, and of course make new builds. This also normalizes the version number between the command line and GUI version. Latest version is now 0.1.1

As a side-note, dealing with Windows fonts is pure pain! In a perfect world, a font would be a file (and it is a file) that you can load using your own preferred library. Alas, Windows has other ideas. As far as I can tell, there is no easy way to get the path of an installed font on Windows using Windows API, sure - you can enumerate the %windir%\Fonts (that's what I started with) but the info you get from that is less than optimal. So how about just getting the raw data? Yea, that's kinda possible... kinda. BEdit is using stb_truetype to load its fonts, and that has been working fine, so when I get a crash in that library I assume the error is the data I'm feeding it. Well how do I make sure the data I'm feeding it from Windows API is the same data as the file contains? ... If I only had a binary file viewer ... Long side-note short, BEdit now supports (at least the ones I have installed) OS fonts.

Command line version is (as always) available right here on Handmade Network, on the main page of BEdit, and the GUI is available on itch.io.

I tried to use bedit (the command line version) to display my profile file. Here is some feedback:

- When displaying an array, pad the indices to the number of digits required by the biggest index.
some_array[ 9] 10
some_array[10] 10
/* Instead of */
some_array[9] 10
some_array[10] 10
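
As a rough illustration of the padding requested above, here is a minimal Python sketch. It is not BEdit code; `format_array` is a hypothetical helper that right-aligns every index to the width of the largest one.

```python
def format_array(name, values):
    """Right-align each index to the width of the largest index."""
    # Width of the biggest index, e.g. 2 digits for indices 0..10.
    width = len(str(len(values) - 1)) if values else 1
    return [f"{name}[{i:>{width}}] {v}" for i, v in enumerate(values)]


lines = format_array("some_array", [10] * 11)
print("\n".join(lines[-2:]))
```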


- Support for the ++ -- -= += ... operators would be nice;

- At some point in my file I have a bunch of strings, and to display them I needed the address/offset of the current location in the file. I don't know if that's possible to do or if there is a better way to do it.
struct profile_meta {
    u32 timeline_id_count;
    u32 timeline_count;
    u32 event_id_count;
    u64 timeline_id_string_buffer_size;
    u64 event_id_string_buffer_size;
    u16 timeline_id_string_lengths[ timeline_id_count ];
    // u8 timeline_id_string_bytes[ timeline_id_string_buffer_size ];
    hidden( timeline_id_string_buffer_size ) timeline_id_string_bytes;

    var address = 40 + 4 + 4 + 4 + 8 + 8 + 2 * timeline_id_count; /* Hardcoding the value, 40 is the size of a header before the meta data. */
    // var address = &timeline_id_string_bytes; <-- how to do this ?
    /* There is another similar array later and it's more complicated to hard code the offset. */

    for ( var i = 0; i < timeline_id_count; i = i + 1 ) {
       @( address ) string( timeline_id_string_lengths[ i ] ) id_string;
       address = address + timeline_id_string_lengths[ i ];
    }
    ...
};
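
For comparison, the same "running address" walk can be sketched outside BEdit in Python. This is only an illustration under assumptions not stated in the thread: little-endian u16 lengths, ASCII strings, and a hypothetical helper name `read_strings`.

```python
import struct


def read_strings(data, lengths_offset, count):
    """Read `count` u16 lengths, then the strings packed right after them."""
    # Assumption: lengths are little-endian u16 values.
    lengths = struct.unpack_from(f"<{count}H", data, lengths_offset)
    # The string bytes start immediately after the length table.
    address = lengths_offset + 2 * count
    strings = []
    for length in lengths:
        text = data[address:address + length].decode("ascii")
        strings.append((address, text))
        address += length  # advance to the next string
    return strings


sample = struct.pack("<3H", 1, 2, 3) + b"abbccc"
print(read_strings(sample, 0, 3))
```

Each entry pairs the offset of a string with its text, which is the information the hard-coded `address` variable tracks in the layout above.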


- Also the address/offset is required if I want to compute a padding for alignment. Or is there a way to specify that some data needs to be aligned to a certain value ? This also may imply that an element could have a size of zero if I compute a padding for the alignment and the data is already aligned.
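
To make the alignment arithmetic concrete, here is a small Python sketch (not BEdit syntax; `padding_for` is a hypothetical name). It shows that the padding is zero when the address is already aligned.

```python
def padding_for(address, alignment):
    """Bytes to skip so that `address` becomes a multiple of `alignment`."""
    return (-address) % alignment


print(padding_for(40, 8))  # already aligned, so a zero-sized padding
print(padding_for(41, 8))
print(padding_for(13, 4))
```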

- I would like more control of the print out. In the example above:
/* the printout is */
id_string "A string"
id_string "Another string"
id_string "A third string"
/* But I would like it to be */
id_string[ 0 ] "A string"
id_string[ 1 ] "Another string"
id_string[ 2 ] "A third string"


- In a similar way:
@( header.timelines_offset ) profile_events events[ header.timelines_size / 16 ];
/* Will display a bunch of */
profile_events (.events[0])
05 B0h                         cycles 52 791 352 285 496 
05 B8h                             id                 58 
05 BCh                          flags                  1 

profile_events (.events[1])
05 C0h                         cycles 52 794 148 593 812 
05 C8h                             id                 58 
05 CCh                          flags                  2 
/* But I would like to
- Not have the "profile_events (.events[index])"
- Not have the blank line
- A way to layout the 3 fields on a single line
- A way to display the index myself on the same line

So probably what I want is a "table display" type.
*/
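
A rough Python sketch of what such a "table display" could produce, one record per line with the index on the same line. This is not a BEdit feature; the record layout (u64, u32, u32, little-endian) and the name `event_rows` are assumptions for illustration.

```python
import struct

# Assumed layout: u64 cycles, u32 id, u32 flags, little-endian, 16 bytes.
EVENT = struct.Struct("<QII")


def event_rows(data, offset, count):
    """Format `count` records starting at `offset`, one per line."""
    rows = []
    for index in range(count):
        address = offset + index * EVENT.size
        cycles, event_id, flags = EVENT.unpack_from(data, address)
        rows.append(
            f"{address:04X}h [{index}] cycles {cycles} id {event_id} flags {flags}"
        )
    return rows


data = bytes(0x5B0) + EVENT.pack(52791352285496, 58, 1) + EVENT.pack(52794148593812, 58, 2)
rows = event_rows(data, 0x5B0, 2)
print("\n".join(rows))
```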


- I would like a way to display arrays that is not 1 value per line. For example a way to specify 16 values on a single line separated by commas, possibly with column numbers at the top to be able to identify a single value;
                              0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
0x00 some_array [  0 -> 15 ] 10 12 60 47 32  7 98 72 44 11  2 13 25 75 53 47
0x10 some_array [ 16 -> 31 ] 10 12 60 47 32  7 98 72 44 11  2 13 25 75 53 47
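
The layout above can be sketched in Python as follows. This is only an illustration of the requested output, not something BEdit provides; `format_rows` is a hypothetical helper and the address column is left out to keep it short.

```python
def format_rows(name, values, per_row=16):
    """Lay out `values` in rows of `per_row`, with a column header on top."""
    if not values:
        return []
    # Cells are as wide as the widest value or column number.
    cell = max([len(str(v)) for v in values] + [len(str(per_row - 1))])
    index_width = len(str(len(values) - 1))
    rows = []
    prefix = ""
    for start in range(0, len(values), per_row):
        chunk = values[start:start + per_row]
        end = start + len(chunk) - 1
        prefix = f"{name} [ {start:>{index_width}} -> {end:>{index_width}} ]"
        cells = " ".join(f"{v:>{cell}}" for v in chunk)
        rows.append(f"{prefix} {cells}")
    # The header is indented by the prefix width so columns line up.
    columns = " ".join(f"{c:>{cell}}" for c in range(per_row))
    header = " " * (len(prefix) + 1) + columns
    return [header] + rows


values = [10, 12, 60, 47, 32, 7, 98, 72, 44, 11, 2, 13, 25, 75, 53, 47] * 2
out = format_rows("some_array", values)
print("\n".join(out))
```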


- I would like to be able to use sizeof( struct_type );

- I display a long string in my layout and that messes up the layout. First, it doesn't display the whole string. Second, it displays the string on two lines, which means the rest of the layout is also on two lines and that becomes unreadable. I suppose it would either require the user to specify the maximum number of characters per row, or for you to retrieve the command line width.

- I would like the "hidden" thing to be an attribute on a regular declaration, so that I can just sometimes hide stuff easily.

- I think using a more "modern" way to declare variables would be great. For example using the Odin/Jai way of having the name at the left and the type, size etc at the right. It would, hopefully, make it easier to modify expressions.

Here are my test files if you'd like to have a look. Some lines are commented
Thank you for the feedback, feedback is one of the biggest things I'm lacking at this point.

I haven't looked at the files you linked yet, but some (semi-)quick comments you might find useful:

When displaying an array, pad the indices to the number of digits required by the biggest index.
- To be honest, the command line version had this feature and I thought it was still around. I'll re-enable it unless I can remember why it's not there anymore.

Support for ++ -- -= += ... operator would be nice
- This is a definite agree, and it is planned for the next time I do anything with the layout language itself. I'll add an improvement ticket to GitHub.

... I needed the address/offset of the current location in the file. I don't know if that's possible to do or if there is a better way to do it.
- It is possible! It was one of the more recent features added. To get the current address you can use current_address(); I added size_of_file at the same time. You can see how it's used in png bet in the examples folder. I have considered the & operator to get the address of the member as well - but I haven't added anything like that yet. For more complete documentation of the language I wrote some docs; that file is also part of the package you get when you download the GUI application.

Regarding your next point, the id_string in the example, it brings up an interesting point I don't know how to solve yet.

The problem is a "repeating member" vs "an array". Technically they're implemented the same way - an array in BEdit is just a "for i"-loop, but the problem comes with representation. Therefore there's a tag that gets applied to arrays telling the command line and GUI that "this was declared as an array". I was thinking of maybe having something like:
struct Foo
{
    u(4) a[2];
    for (var i = 0; i < 2; i = i + 1)
    {
        u(4) b[2];
        u(4) c;
    }
};

- Would produce -

a[0]   123
a[1]   456
b#0[0] 123
b#0[1] 456
c#0    123
b#1[0] 123
b#1[1] 456
c#1    123

to disambiguate between the cases, but let's say the jury is still out on that one... Maybe a repeating member should just pretend it adds to an array, but that might lead to confusion if you have multiple arrays that are repeating members. Arrays and repeating members are still an unsolved problem, but a known one.

The display of "small enough" members (and arrays) being on separate lines is a known annoyance, not only for the command line but also for the GUI (especially the hierarchy view). It's very clear when you have a vec3 type that the members being on different indent levels bring no value. I've been playing around with having internal as a keyword to indicate that "this member should be interpreted as non-nested", but that wouldn't solve the array case you mentioned.

I would like to be able to use sizeof( struct_type )
- So would I. It would be really useful, but it's tricky. One of the reasons I haven't implemented it is that the size of the type can depend on.. well.. everything. Unlike in C, the type can have parameters, and its size can also be determined by the values of its members. Hence a sizeof operator isn't trivial to implement. However, most struct sizes don't depend on the data file being evaluated, and for those it's trivial to determine the size. I added this issue to keep track of the feature request. Even in the case where the size of the type depends on the current address, it's not impossible to implement (maybe sizeof(@(my_address) MyStruct)?).
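
To illustrate the distinction between sizes that are known up front and sizes that depend on the file, here is a hedged Python sketch. It has nothing to do with how BEdit is implemented; `SIZES`, `static_size`, and the field lists are hypothetical stand-ins.

```python
# Hypothetical table of fixed-size scalar types.
SIZES = {"u8": 1, "u16": 2, "u32": 4, "u64": 8}


def static_size(fields):
    """Return the byte size if every field is fixed-size, else None."""
    total = 0
    for type_name, count in fields:
        if type_name not in SIZES or not isinstance(count, int):
            return None  # the size depends on values read from the file
        total += SIZES[type_name] * count
    return total


# cycles, id, flags: no dependency on the data file.
profile_event = [("u64", 1), ("u32", 1), ("u32", 1)]
# An array whose length is another member's value cannot be sized statically.
profile_meta_start = [("u32", 1), ("u16", "timeline_id_count")]

print(static_size(profile_event))
print(static_size(profile_meta_start))
```

The first case corresponds to the "trivial" structs mentioned above; the second is the kind that would need the evaluated data (or an address) to answer.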

I would like the "hidden" thing to be an attribute on a regular declaration, ...
- Not a bad idea.

I think using a more "modern" way to declare variable would be great...
- Ah, it would certainly make it easier to parse! Joking aside, the reason it's "like a C language" is to make it easier to learn for C/C++ programmers - a knowledge-set I presume most people who deal with binary files have. Of course there's nothing preventing us from implementing multiple "languages", as both the command line and GUI only use the intermediate representation (and some line information for debugging purposes), but that's not on the roadmap at this point in time... but the layout language is in the public domain ;)

I'll have a look at the files you linked later this week if time permits (I do have some BEdit plans for the weekend unless I chicken-out).
I updated the file as there was an easier way to do what I wanted.

A few other notes:

I would like a way to set the current offset without defining any member. For example to set up the current offset before defining members in a loop. At the moment I used the following.
@( header.timelines_offset - 1 ) hidden( 1 ) padding;
/* But */
@( header.timelines_offset );
/* Would be nice to have. */


I would like print to be a "core" feature, not a debug thing. At the moment (in the command line version) it outputs the " around strings and as far as I know I can't just print a number alone. I don't need printf formatting, if there are print_string, print_integer, print_float functions, I'm satisfied.
I would like to be able to do
print( "\nTimeline [ " );
print( i );
print( " ]\n" );

/* To output:

Timeline [ 0 ]

*/
I've been thinking of having the address directly writable, something along the lines of current_address = blah; but I do like using the @ as it's a bit more consistent. Maybe the address specifier shouldn't be tied to member declarations at all, something like
@(foo.bar);
u(4) member;

// Instead of '@(foo.bar) u(4) member;'

I do like that more than current_address.

Another option is to treat @ as a variable (in reality it is) that indicates the current address,
@ = foo.bar;
u(4) member;


... or when in doubt, support both! Trailing semicolon could even be optional.

Regarding print.. I did some parsing of protobuf content earlier today and realized the print needs to be much more powerful (also function support is needed, something for version 0.1.4 I think). I'm considering something like f-strings of Python 3
print("## Timeline[ {index:u} ] ##"); // 'u' indicating it's to be displayed as unsigned decimal.

I do believe Swift has something similar (string interpolation).
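
For reference, this is how the Python 3 f-string version of that line behaves. Note that Python itself has no `u` format code; `d` (decimal) is used here instead, and the variable name `index` is just for illustration.

```python
index = 0
# 'd' formats the value as a decimal integer.
line = f"## Timeline[ {index:d} ] ##"
print(line)

# Other presentation types follow the same pattern, e.g. hexadecimal.
hex_text = f"{255:x}"
print(hex_text)
```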

For the protobuf use-case it's a bit different though as that has a member that is encoded, and what you really want to see is the member as decoded. Once functions are in BEdit proper a decoder specifier could be added.

And so I don't forget, and so everybody can keep track of it: #1 and #2

As a random question regarding the files you linked, any reason why you decided not to declare the profile_event.flags as an enum type?
jens
As a random question regarding the files you linked, any reason why you decided not to declare the profile_event.flags as an enum type?


No reason, it was just not important to me at the moment. So I just tried and it doesn't work well as those are flags, meaning the values can be combined.

enum profiler_event_flag {
    /* I generally define flags using the shift left operator, but it's not supported. */
    /*
    profiler_event_flag_none = 0,
    profiler_event_flag_start = 1 << 0,
    profiler_event_flag_end = 1 << 1,
    profiler_event_flag_collapsable = 1 << 2,
    profiler_event_flag_collapsed = 1 << 3
    */
    profiler_event_flag_none = 0,
    profiler_event_flag_start = 1,
    profiler_event_flag_end = 2,
    profiler_event_flag_collapsable = 4,
    profiler_event_flag_collapsed = 8
};

struct profile_event {
     u64 cycles;
     u32 id;
     /* u32 flags; */
     profiler_event_flag( 4 ) flags;
};

/* If flags contains:
5 it should display profiler_event_flag_start | profiler_event_flag_collapsable 
6 -> end | collapsable
13 -> start | collapsable | collapsed
14 -> end | collapsable | collapsed
*/
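
The combined display described in that comment can be sketched in Python like this. It is only an illustration of the desired output, not BEdit behaviour; `FLAGS` and `describe_flags` are hypothetical names, and unknown bits are shown as hex as an assumption.

```python
# Flag values copied from the enum above.
FLAGS = {
    1: "profiler_event_flag_start",
    2: "profiler_event_flag_end",
    4: "profiler_event_flag_collapsable",
    8: "profiler_event_flag_collapsed",
}


def describe_flags(value):
    """Render a flag value as its set names joined with ' | '."""
    if value == 0:
        return "profiler_event_flag_none"
    names = [name for bit, name in FLAGS.items() if value & bit]
    # Bits that have no name in the table are shown as hex.
    leftover = value & ~sum(FLAGS)
    if leftover:
        names.append(f"0x{leftover:08X}")
    return " | ".join(names)


for value in (5, 6, 13, 14):
    print(value, "->", describe_flags(value))
```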


EDIT: I forgot to mention an issue. If a flag isn't defined in the enum, it displays the hexadecimal value, but the byte order is wrong. Well I think it's wrong. For example it displays 0x05000000 instead of 0x00000005 for a decimal value of 5.

I triggered an assert while trying to create a global array. For example add "var event_strings[ 128 ]" outside of a struct and it should trigger.

What I was trying to do is to display profile_event.id as a string. The strings are defined in the meta struct, but they aren't an array (it's the block of memory displayed in the second loop in profile_meta). So what I wanted to do is create a global array containing the string offsets and then use that array in the profile_event struct.

/* I know there aren't more than 128 entries here,
but ideally I would like to use profile_meta.timeline_id_count
and profile_meta.event_id_count.*/
var timeline_strings[ 128 ]; /* or var u64 timeline_strings[ 128 ]; */
var event_strings[ 128 ]; /* or var u64 event_strings[ 128  ]; */

struct profile_meta {

    u32 timeline_id_count;
    u32 timeline_count;
    u32 event_id_count;
    u64 timeline_id_string_buffer_size;
    u64 event_id_string_buffer_size;
    u16 timeline_id_string_lengths[ timeline_id_count ];

    for ( var i = 0; i < timeline_id_count; i = i + 1 ) {
       timeline_strings[ i ] = current_address( );
       string( timeline_id_string_lengths[ i ] ) timeline_id_string;
    }

    u64 timeline_sizes_in_bytes[ timeline_count ];
    u32 timeline_ids[ timeline_count ];
    u16 event_id_string_lengths[ event_id_count ];

    for ( var i = 0; i < event_id_count; i = i + 1 ) {
        event_strings[ i ] = current_address( );
        string( event_id_string_lengths[ i ] ) event_id_string;
    }

    u64 timing_frequency;
    u64 fallback_cycles;
    u64 fallback_time;
    u8 platform;
    u8 windows_version;
    u8 windows_qpc_shift;
    u64 windows_qpc_bias;
    u64 windows_mul128_value;
    u64 windows_add_value;
    u32 linux_mult;
    u32 linux_shift;
};

struct profile_event {
     u64 cycles;
     u32 id;
     /* Can I reference "meta"  here ? Should I pass it as a parameter ? */
     @( external event_strings[ id ] ) string( meta.event_string_lengths[ id ] );
     u32 flags;
     // profiler_event_flag( 4 ) flags;
};


But at that point I feel that writing a program that does that in C might be more comfortable to me.
mrmixer
For example it displays 0x05000000 instead of 0x00000005 for a decimal value of 5.

As I have recently spent most of my time on the GUI this is a feature I forgot to port back to command line.

This was originally an "intended feature": it displays it as if it was declared as raw, and that is always in byte-order. But as you mentioned, for little endian it looks incorrect. In the GUI it's displayed as decimal. I decided to normalize these cases for next release to display it in "display order" hex (as if declared with u(size, hex)). This normalization change also adds bitwise-or:ed display of enums to the command line version - although I still haven't found a good way to edit these styles of enums without more information, at least the graphical display is proper on both GUI and command line.
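
The difference between the two displays can be reproduced with a few lines of Python. This is a sketch of the arithmetic only, assuming a 4-byte little-endian value; the variable names are made up for the example.

```python
# The value 5 stored as a 4-byte little-endian integer.
raw = (5).to_bytes(4, "little")

# "Byte order": the bytes printed in the order they appear in the file.
byte_order = "0x" + raw.hex().upper()

# "Display order": decode the integer first, then print it as hex.
display_order = f"0x{int.from_bytes(raw, 'little'):08X}"

print(byte_order)
print(display_order)
```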

I also usually use shift operators in C/C++ to define enum values, but as BEdit doesn't (yet) support expressions for enums I use binary literals as a work-around. That is
enum
{
    value_a = 0b001,
    value_b = 0b010,
    value_c = 0b100,
};
but I agree that a shift would be preferable.


The second part got me thinking, does BEdit actually already have pointer-ish logic? (I don't think it helps you now, because BEdit is currently lacking array / repeating member access - but I got curious.)

You could do something like a pointer:
struct Bar { u(4) value; ... };

struct Baz(var barAddress)
{
    /* hidden <- not yet supported */ @(external barAddress) Bar barRef;
    string(barRef.value) str;
};

struct Foo
{
    var barAddress = current_address();
    Bar bar;
    Baz(barAddress) baz;
};


That would functionally be like a pointer but the actual code generated would be very inefficient compared to a normal pointer.

The next version also includes the layout improvement for array indices in the command line version (as well as improvements and bug fixes for GUI and command line). I plan to update both command line and GUI tonight or tomorrow.