std::source_location doesn't work if you need strings to be types (elbeno.com)
88 points by jandeboevrie on Nov 16, 2023 | hide | past | favorite | 76 comments



> You need to know the size at compile time. std::source_location::file_name() gives you a const char* – you don’t know the size of the string. And because it has to be a function argument, it can’t be constexpr.

At least this part is not entirely correct, the following compiles in C++20:

  consteval std::string_view filename(const std::source_location &loc = std::source_location::current()) {
    return loc.file_name();
  }
See [0]. I don't know how to go from here to a type, as the author wants, but I'll venture that there's some ungodly template metaprogramming hack that will get you there.

[0] https://godbolt.org/z/K6ejTr9cn


So, one issue is that source_location is not a valid non-type template parameter, as it is not a structural type. But you can trivially make one yourself if you choose an upper bound on the filename (and function name) size.

The second issue is that it seems that default values for non-type template parameters are not evaluated at the instantiation location, but at the definition location, so you need to make the template parameter explicit. This is the best I could come up with:

   log<o>("hello", "world")  // at example.cpp:30
   
which prints[1]:

   app/example.cpp:30: hello world
[1] https://godbolt.org/z/z1eGc6cM4


Very, very nice!

Could even reuse the explicit template parameter for the verbosity level, like:

  log<I>(…);  // info
  log<D>(…);  // debug


Ooh, excellent idea! Make it a feature from a limitation!


you can also do compile time strlen(), if you are ok with goofy variadic templates


Implementing a constexpr strlen() is trivial, and it looks just like it's from your ancient C textbook, save for the "constexpr" keyword. No goofiness involved.

Or you use what the C++ standard library has to offer.

std::string_view(ptr).length(); or std::string(ptr).length(); or std::char_traits<char>::length(ptr); all work at compile time.
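A small sketch of both options, in case it helps. `ct_strlen` is a made-up name for the hand-rolled version; the other two checks use only standard library calls. The `std::string` variant is left out here because it needs a C++20 library with constexpr `std::string`.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hand-rolled compile-time strlen, textbook style plus "constexpr".
constexpr std::size_t ct_strlen(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// All three are evaluated by the compiler; nothing runs at program start.
static_assert(ct_strlen("hello") == 5);
static_assert(std::string_view("hello").length() == 5);
static_assert(std::char_traits<char>::length("hello") == 5);
```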


For application development the real big problem is that std::source_location::function_name() bloats binary size massively and cannot be selectively disabled. It is common to want file/line which can be shared per-file, or are small in size, respectively. But function name is unique per function and cannot fold. If you use source location a lot, such as for logging, this adds up really fast.

We are seeing binary sizes increase by 30-50% just from enabling source location, even if we don’t want function_name. Sadly this means we need to turn it back off again and go back to __FILE__ and __LINE__.


Hum. Suboptimal. Possibly you could create your own source location with just file and line? Or maybe it's a job for a linker script to clean up all those function_name strings?
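One way a source location that carries only the file and line could look, as a rough sketch. It relies on the `__builtin_FILE()` and `__builtin_LINE()` compiler builtins (available in GCC, Clang and recent MSVC, but not standard C++). `slim_location` and `log_line` are hypothetical names, and whether this actually avoids the size increase on a given toolchain would have to be measured.

```cpp
#include <cstdio>

// Hypothetical slimmed-down location: file and line only, no function name.
struct slim_location {
    const char* file;
    int line;

    // Compiler builtins, not standard C++. Used as default arguments,
    // they report the position of the caller.
    static constexpr slim_location current(const char* f = __builtin_FILE(),
                                           int l = __builtin_LINE()) {
        return {f, l};
    }
};

// Example logging entry point; callers never pass the location themselves.
inline void log_line(const char* msg,
                     slim_location loc = slim_location::current()) {
    std::printf("%s:%d: %s\n", loc.file, loc.line, msg);
}

static_assert(slim_location::current().line == __LINE__);
```

A call such as `log_line("starting up");` would print the file and line of the call, in the same way std::source_location does when used as a default argument.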


How do you explain this increase? Do you have huge amounts of tiny functions that all reference std::source_location? (Genuinely asking -- I've never used source_location).


That is some of it. The function names also include a lot of template specializations (thanks to a newer release of MSVC). Between lots of functions and very long names it adds up quickly. The function names are all unique so there are no savings from string pooling.


`std::source_location` does not have to be a function argument.

It can also be used in an NSDMI, in which case it ends up being instantiated whenever the NSDMI is used, i.e. at every aggregate initialization site.

https://godbolt.org/z/v5rhjqdbY

Of course, that doesn't really allow you to do anything that you couldn't already do with a constexpr function (see below). In particular, such a struct can be used as a non-type template parameter, but a `template<NSDMI sloc = NSDMI{}> void foo();` will merely report the line where the template is defined, not where `foo` is called.

    // Results in a compile-time constant, so can be used everywhere where __LINE__ could be used.
    constexpr int line(std::source_location loc = std::source_location::current()) { 
       return loc.line();
    }
So std::source_location can fully replace __FILE__/__LINE__, but it cannot always replace the macro where __FILE__/__LINE__ were being used.


Huh, being able to see where a template was instantiated is niche but kind of interesting.


>But of course this is a macro, and means that your logging/assertion code has to be a macro.

No, only the code that references __FILE__ needs to be a macro. Usually what I see is a logging function that takes line and file as arguments, and then a macro that calls the function with __FILE__ and __LINE__.


If you expose a macro that then calls your function, what's the difference between what OP is saying?


OP seems to be saying that the logging code has to be in a macro. But it doesn't: it can be in a function that's called by a macro. It's the difference between having the logging code in a macro:

    #define LOG(...)\
        do{\
            if(g_logging_enabled){\
                printf("%s:%d: ",__FILE__,__LINE__);\
                printf(__VA_ARGS__);\
            }\
        }while(0)
And having the logging code in a function called by a macro:

    extern void Log(const char*file,int line,const char*fmt,...);
    #define LOG(...) (g_logging_enabled?Log(__FILE__,__LINE__,__VA_ARGS__):(void)0)
With the Log function being along these lines:

    void Log(const char*file,int line,const char*fmt,...){
        printf("%s:%d: ",file,line);
        va_list v;
        va_start(v,fmt);
        vprintf(fmt,v);
        va_end(v);
    }
(This logging code is deliberately simple; in a realistic system, this would be pages of junk dealing with all manner of log categories, log levels, multiple enable flags and their overrides, log target handling, and whatnot. Absolutely the sort of thing that is 0 fun to keep track of in a macro, even before you consider the fact that it's a ton of code to be pasting everywhere you want to log a string.)


I understand OP but I haven't seen a single logging implementation that would inline the entire implementation in a macro and I suspect the author of the blog post hasn't either.


If it's a macro, then the preprocessor expands the macro at the callsite (which then expands the __FILE__ and __LINE__ macros in that location).

If you use a function, then the __FILE__ and __LINE__ expand inside that function and point to the location inside the function rather than the callsite of the function.


Which is why you trampoline the function call through a macro to get the expansion, the logging logic can still be inside the function.


I agree and that's how I've always handled it. But the OP is trying to solve this purely without macros which is why it is an issue.


Regarding the problem of strings in embedded, I once did a pretty nifty thing in a logging framework I built for an embedded system in an elevator (Cortex M0 I think it was) with very little flash for the binary and for logging. I had a logging macro which took in a string to log together with some arguments (like printf). The macro expanded to add an attribute to the string constant to put it in a special section I created with a linker script. Then in the macro what it actually logged was the memory offset in the section together with the arguments. So that way the log was extremely slim. As an extra bonus, I then stripped the special section with the strings from the binary and had an offline script translate the logged memory offsets and attributes to strings.


Google's embedded library Pigweed [1] industrializes exactly this approach and has some unholy macro nonsense to make the string hashing work in pure C as well.

[1]: https://pigweed.dev/pw_log_tokenized/


Out of curiosity, why would you have to be so efficient in an elevator?

An elevator I imagine costs big $$$, has no lack of power, and plenty room for electronics.


Same reason a top of the line PC tower still has crappy stamped metal frame and crappy injection molded plastic trims.

The main contractor of the elevator unit is a mechanical engineering company, they contract out the control unit to the lowest bidder.


Nope. This was all in house. But we are talking at home lifts for the rich and lazy (and sometimes elderly). And there was quite a bit of electronics in it with processors at each floor, on the control panel and for the motor controller and so on. So the cost added up. But yeah, I agree, some more flash would have been worth it.


Defmt is a rust library that does exactly this. This has become almost a standard in embedded rust.

There is even a c++ client-side implementation: https://github.com/Javier-varez/Postform/

...which uses macros for logging ;)


You can do this (if you want to be evil!)

    struct fstr {
        constexpr fstr(const char* s) {
            for (length = 0; s[length]; length++) data[length] = s[length];
        }

        char data[256] = {0};
        size_t length = 0;
    };

    template <fstr sz>
    struct log {
        log() { std::cout << std::string_view{ sz.data, sz.length } << std::endl; }
    };

    int main() { log<fstr(std::source_location::current().file_name())> _; }


Broken is a strong word.

There are a lot of language features that aren’t suitable to specific contexts like embedded or realtime.

It’d be interesting to see if this concern was raised during discussion of source_location and whether a rationale was established, but I wouldn’t go so far as to say it’s broken when it just happens not to be suitable to a specific, narrow implementation context.


Everyone knows that the best way to get something done on the Internet is with an inflammatory title and a potentially incorrect contention.


Cunningham's Law: The best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer.


The article calls it "an inflammatory title and a potentially incorrect contention" and those happen to be the two things that HN asks to be edited out of titles ("unless it is misleading or linkbait." - https://news.ycombinator.com/newsguidelines.html), so I've taken a crack at shrinking the claim down to the article's actual complaint.

If there's a better (more accurate and neutral) title, preferably using representative language from the article itself, we can change it again.


This is doable, but it doesn't have great ergonomics. Here https://godbolt.org/z/1z4qP8esb you can find a general-purpose function to_static_array(), which can turn any dynamically sized input range into a std::array at compile time. The string is stored as a std::array NTTP and not contained in the binary.

But there is a big catch: the range must be returned from a constexpr function. So to get the correct source_location object you always need to spell out the entire `to_static_array<100>([] { return current_file_name(); })` expression where you need it.

So while this does work for source_location::file_name() it doesn't for source_location::function_name() as that will always result in the name of the lambda.

So it's only a partial solution, and it has terrible ergonomics for this particular use case. So I don't think we have completely eliminated the need for macros just yet.

But other than that it's a great tool if you want to do some constexpr computations with std::vector or std::string and then turn the result into a constant-sized compile-time array, optionally baking it into the binary.


If the blog post author showed us the code they want to write (but doesn't work), it might be easier to understand what they mean.


They did link an hour long conference talk (which I haven't watched), that I assume explains in more detail the kind of logging system that they're working with: https://www.youtube.com/watch?v=Dt0vx-7e_B0

I think that basically they want a class template magic_functor that can be invoked as below

    magic_functor<std::source_location::file_name()>::type
And then they could get a compile-time number corresponding to the file_name by doing:

    using T = magic_functor<std::source_location::file_name()>::type;
    int x = my_logging_lib_string_to_number<T>();
But unfortunately it simply doesn't work, because magic_functor is unwritable, because a `const char *` doesn't remember its size like a `const char[]` does.


If the `const char*` is a compile-time value, you can get its size via `std::char_traits<char>::length()`.

Here I could get `magic_functor<std::source_location::file_name()>::type` to work on MSVC: https://godbolt.org/z/Ph8dd78fa gcc doesn't accept this code, but that seems more like an implementation problem than a specification problem?


There are a lot of restrictions on what a pointer-type template parameter can point to. In particular it needs to have linkage (which for example string literals do not have). Whether the results of constexpr functions must, can or can't have linkage is beyond my paygrade, but I suspect this is an MSVC extension.

edit: linkage appears to propagate through constexpr. So it depends on how ultimately the original string is generated. I don't know what guarantee the standard gives (cppreference doesn't mention anything and I can't be bothered to look at the standard).


I liked this better, it uses some macros but it's easily readable and compresses the logs nicely.

https://youtu.be/FyJI4Z6jD4w


What do you mean? They do. They want to replace the name of the file with a number at compile time so that the logs are tiny. The way they do that now is that they pass __FILE__ to a char parameter pack expansion to generate a unique type that erases to nothing but a number at compile time and extract the unique id of that type using nm but that doesn’t work with source_location because it has char* as the type.

I wonder if there’s some kind of workaround where you wrap source_location so that it outputs the hash of the file name using a known consteval hashing function. Then you have your unique type (just assign the hash value to your type) at the cost of some extra overhead of needing to hash every filename in your build.


But it's quite unclear what they are currently doing. Literally translating what you are describing:

   template<char... FILE>
   struct magic { };

   magic<__FILE__> m;
This does not compile with gcc, and std::source_location fails to work for exactly the same reason. I don't see any additional problems with std::source_location that don't equally apply to __FILE__.


That's a prose description, not code.


I thought one of the basic skills of programming is to turn a prose description into code...


And one of the basic problems of programming is that the resulting code is almost certainly wrong. Which is why it's always safest to look at the specific code in question, rather than a description of what it's supposed to do.


Doing that takes a lot of time. It would help the article to spell it out somewhat better.

Or at least copy a couple slides from the video.


I don’t think I’ve claimed otherwise.


I imagine they want something like this [0] but with even fewer string pointers, using e.g. hashes or globally assigned numbers instead.

[0] https://www.embeddedrelated.com/showarticle/518.php



I think I don't understand the post or problem. How can't you use `constexpr` computation to get the file name at compile time?

  #include <stdio.h>
  
  #include <array>
  #include <source_location>
  #include <string_view>
  
  template<size_t N> constexpr auto GetFilename(char const name[]) {
    std::array<char, N + 1> a = {};
    std::char_traits<char>::copy(&a[0], name, N);
    return a;
  }
  
  template<auto Value = GetFilename<
             std::char_traits<char>::length(std::source_location::current().file_name())
           >(std::source_location::current().file_name())>
  using SL = std::integral_constant<decltype(Value), Value>;
  
  int main() {
      printf("%s\n", &SL<>::value[0]);
  }


Me neither. Unless I am misunderstanding the issue in the post, a simple

  static constexpr std::string_view filename = std::source_location::current().file_name();
  static_assert(filename == "whatever_the_name_of_file_is");
  static_assert(filename.length() == 28);
will do. Both the filename and the filename length are available at compile time.


In your example, the source location will be for the template definition, instead of the function call. But yes, although you can't use std::source_location directly, you can roll your own.

See my other comments elsethread.


Ahhh, I understand now. Thanks!


I figured out how to do something like this once without macros. It took me more days than I care to admit. It is something that sounds like it should be easy but is unreasonably difficult and requires hiding code horrors behind an elegant API.


What was your basic approach?


This shows what I perceive as a problem in the culture around C++. Tons of suggestions how to write some trivial, probably unimportant piece of code in an "optimal" way. There is a perceived problem, macros, but then 90% of those suggestions are worse.

A good, straightforward solution, using existing tech, is hidden in some leaf comment, or not actually there explicitly. I would guess >>50% of the posters don't know that this is doable. Let me type it out:

    void actual_logging_function(std::string_view filename, int line, std::string_view msg)
    {
        printf("In %.*s.%d: %.*s\n", static_cast<int>(filename.length()), filename.data(), line, static_cast<int>(msg.length()), msg.data());
    }

    #define logging_function(msg) actual_logging_function(std::string_view(__FILE__), __LINE__, (msg))
There is a plain C version of this too, just replace std::string_view(__FILE__) by sth like my_string_view(__FILE__, sizeof __FILE__ - 1). Obviously the length can also be passed directly to the logging func.

I get it. Macros are terrible. But can we stop coming up with "solutions" that are even more terrible, just on other axes?

And really, the initial "problem" was just that the string length wasn't a compile-time constant. I would venture out and claim that this isn't an actual problem, unless "logging" is confused with "high-performance string formatting". If an application is logging at a rate where a simple "%s" becomes an issue, isn't there something else that is broken? And for context - typically a logging call also has to do other formatting that can't be moved to compile time, such as formatting integers.


No, it is you that is missing the point.

Your example can be written in C++ as this:

   void logging_function(std::string_view msg, std::source_location l = std::source_location::current()) {
         // log here
   }
No need for macros, it just works in almost all scenarios.

But this solution (and your __FILE__) will add a bunch of strings (representing the file names) into the read only section of the final binary so that they can be available at runtime. Apparently this is undesirable for the embedded scenario the author has in mind. So in this thread we were looking for a way to encode the source_location as a template parameter (which the author claims it can't be done) instead so that it is part of the symbol and can be stripped out.

It is a very niche need, it just turned out to be a fun game to play.


Well I have a history of missing the point for sure :-). So let me ask further, where should the string be stored if not in the read only section?

As far as I understood the post, the issue with your above code is that std::source_location::current() doesn't make the size (length) of the filename string available at compile time (*), while with __FILE__ the size is known -- it's even in its type.

So if you want to play template or constexpr tricks, you probably can do what you have in mind using __FILE__. Just be aware that a (pointer,length) pair (like string_view) or a plain zero-terminated string pointer will have the size information available at runtime but not compile time.

So not saying that I didn't miss anything (I probably did), and not to mean any offense, but maybe you're also one of those >>50% of people that I mentioned? :D

(*) which, as you and I already agree, is probably not really an issue. It's more a "fun" game to play -- the kind of game that a lot of C++ people like to play, and then sell it as "optimal solution" or "zero-cost abstraction".


In the final binary, it would not be stored at all. The idea is that if you can use the location as a template parameter, you can generate a unique symbol for it (say, a function) and log that address instead, stripping the actual function name from the binary. When reading the log you can map the function address to the function name and recover the source_location from that (but I'm not familiar with the actual process). The key is that the mapping need not be stored in the binary.

As shown elsethread, you can easily compute the size of std::source_location::current().file_name() at compile time. The only limitation is that neither source_location nor the std::source_location::current().file_name() pointer themselves can be used directly as non-type template parameters. But there are ways around that.

And yes, you can do the same with __FILE__ and __LINE__, but you need to wrap them in a macro for usability. The question is whether there is a solution that doesn't use macros at all (I posted one elsethread).


Does anyone know what std::source_location::function_name() returns inside a lambda? Is it the outer function name, or something like this:

"foo()::<lambda()#1>"

I remember writing some constexpr expressions to strip the lambda portion (which had a guid). Is there now an easier way?


I just tested it:

    1  #include <source_location>
    2  #include <iostream>
    3  int main() {
    4      auto c = [](std::source_location loc1 = std::source_location::current()){
    5          std::source_location loc2 = std::source_location::current();
    6          std::cout << loc1.function_name() << ":" << loc1.line() << "\n";
    7          std::cout << loc2.function_name() << ":" << loc2.line() << "\n";
    8      };
    9      c();
    10     return 0;
    11 }
As eklitzke said, this results in (using gcc):

    int main():9
    main()::<lambda(std::source_location)>:5


Thanks!

Is there a way to extract the undecorated (?) function name? ("main" in this case, without the return type or args)

Maybe a helper class or formatter?


The actual name format is implementation defined. Here's the output from clang 17:

    int main():9
    auto main()::(anonymous class)::operator()(std::source_location) const:5
so your formatter/helper would need to be compiler-aware, although to extract “main” in both cases I suppose you can just grab everything between the first whitespace and the first (, although I assume both would also contain specifiers like const when they are part of the return type. I don’t know what MSVC outputs.


Thanks again, I'll have to give it a try once we get on C++20


If you make the lambda take the source location as a default parameter initialized to `std::source_location::current()` (which is the typical usage) then the default argument is created in the scope of the caller, so it would expand to the name of the function (and line number, etc.) calling the lambda rather than the lambda itself.

If you want to literally capture the name of the lambda and the line number in the lambda you'd create the source_location within the lambda instead.


What kind of application are you running that you have to care about string constants in the program binary? I can't imagine a situation where you have so little flash you can't store strings, but you have enough storage and CPU cycles to keep logs or run UART. IMO, running that close to the limit of your chip's resources is a bad thing.

This post reads to me like complaining that your hammer ruined your screw.


the post involves actual Intel firmware engineers solving actual problems for their embedded devices; one example of such a device is a power management controller within a chipset.

it is quite astounding to me that you're going to try to claim that Intel is using its own chips in an "incorrect" manner.

the presentation might help: https://youtu.be/Dt0vx-7e_B0?si=QLfI5-9LHh5ehb5a&t=326 where specifically the problem they have encountered is that the string constants end up roughly the size of their application code.


Yeah, there's no indication at all what the use case here is from this post.

And yes, I'd still argue this is poor design, it doesn't matter who designed it. We're all very aware that Intel is not immune to bad design decisions. It's quite astounding to me that you're claiming Itanium was a good design. /eyeroll


To be fair, it's not immediately obvious that this blog post is authored by an Intel engineering lead. I had to track down his LinkedIn page to find that out.


The issue with source_location is that it cannot be used in a macro, what I generally want is something like stack_trace::parent( ) so that assert like macros can report the location of their caller and error, not the location of the macro


something like stack_trace::parent would be hard to do in a modern toolchain, since we now have people deploying code to environments like WebAssembly that prohibit stackwalking

In C#/.NET's case, even though stackwalking is available in the API, it is preferred to push the location info in from outside and there are mechanisms to automate it (CallerFilePath, etc)


It's part of C++23, https://en.cppreference.com/w/cpp/utility/basic_stacktrace. It's not parent, and I think I am wrong that that is needed, just the current position in the stack, so assuming the trace isn't empty(), it's like stack_trace::current( )[0]


Hmm, is there a way to force particular string constants to be written to particular ELF sections?

Because if so, you could just ... not dereference the pointer, and do object-file magic to make the section not actually mapped.


  string_constant<'H', 'e', 'l', 'l', 'o'>
Sometimes I see C++ code, and I feel deeply sad for the people that are subjected to it.


You don't have to do this, here's an alternative approach:

    template<std::size_t N>
    struct fstr: public std::array<char, N + 1> {
        using std::array<char, N + 1>::array;

        constexpr fstr(const char* str) noexcept : std::array<char, N + 1>() {
            for (size_t i = 0; i < N; i++) (*this)[i] = str[i];
        }
    };

    template<std::size_t N> fstr(const char (&)[N]) noexcept -> fstr<N - 1>;

    template <fstr sz>
    struct log {
        log() { std::cout << std::string_view{ sz.data(), sz.size() } << std::endl; }
    };

    int main() { log<"Hello, world!"> _; }


They're not actually writing these template instantiations, you would write regular strings in your code. The whole point is that this is for a logging library that does compile-time trickery to replace strings with integer labels. They have some compile-time tricks to convert the strings passed in logging macros to these template instantiations. I'm guessing the way it works is that since each of these template instantiations represents a unique type they can convert each type to a known numeric code, and then have a map of codes to the original strings to read the log data. But the point is you'd only have to deal with these weird templates if you're actually implementing or debugging the library, which hopefully you'd only have to do rarely (ideally just once!).
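To make that kind of compile-time trick concrete, here is one possible way to turn an ordinary string into such a character-pack type. This is a guess at the general technique, not the library's actual code; `expand` and `to_string_constant` are invented names, and the string is passed through a captureless lambda so that it stays usable in constant expressions.

```cpp
#include <cstddef>
#include <string_view>
#include <type_traits>
#include <utility>

template <char... Chars>
struct string_constant {};

// f must be a captureless lambda returning a std::string_view.
// f() is re-evaluated in a constant expression for every index.
template <typename F, std::size_t... Is>
constexpr auto expand(F f, std::index_sequence<Is...>) {
    return string_constant<f()[Is]...>{};
}

// Builds one index per character, then expands them into the pack.
template <typename F>
constexpr auto to_string_constant(F f) {
    return expand(f, std::make_index_sequence<f().size()>{});
}

constexpr auto hello =
    to_string_constant([] { return std::string_view{"Hello"}; });

static_assert(std::is_same_v<std::remove_const_t<decltype(hello)>,
                             string_constant<'H', 'e', 'l', 'l', 'o'>>);
```

A logging macro could hide the lambda so that user code only ever contains the plain string literal.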


That is code from people not used to modern C++, as you can see from most replies here.


A little video of a guy playing with it https://www.youtube.com/watch?v=TAS85xmNDEc


That's an interesting problem. I'm still realizing how different embedded software constraints are from the desktop.


I don't think this post is actually that representative of the field.

Fretting over the binary size of string constants is not something I've ever seen. Generally if you don't have enough flash to store strings, you also don't have a use for strings. You won't be logging to external memory or to serial. If you have that little flash, you generally also have very few CPU cycles. Logging of any sort would take up a huge percentage of your CPU cycles and you won't have anything left to run the application. Not to mention the RAM, which is usually much much smaller than flash.

The problem posed in TFA isn't really a problem. If you need continual logging, you must use a chip with more CPU, which almost always means more memory. If you have a chip with 512B of flash and 128B of RAM, you don't need logging.

Exceptions apply, of course, but this is a case of wrong tool for the wrong problem.



