
PATH_MAX simply isn't

Many C/C++ programmers eventually run into a limit known as PATH_MAX. Basically, if you have to keep track of paths to files and directories, how big does your buffer have to be? Most operating systems and file systems I've seen limit a filename, or any single path component, to 255 bytes or so. But a full path is a different matter.

Many programmers will immediately tell you that if your buffer is PATH_MAX, or PATH_MAX+1 bytes, it's long enough. A good C++ programmer would of course use C++ strings (std::string or similar) to avoid any buffer length issues. But dynamic strings that take care of the nitty-gritty of buffer sizes only solve half the problem. Even a C++ programmer may at some point want to call getcwd() or realpath() (_fullpath() on Windows), which take a pointer to a writable buffer rather than a C++ string, and which, according to the standard, don't do their own allocation. Even the ones that do their own allocation very often just allocate PATH_MAX bytes.

getcwd() returns the current working directory. realpath() takes a relative or absolute path to any file, possibly containing .. components, runs of /././., extra slashes, symlinks and the like, and returns a full absolute path without any of the extra garbage. These functions have a flaw though.

The flaw is that PATH_MAX simply isn't. Each system can define PATH_MAX to whatever size it likes. On my Linux system it's 4096, on my OpenBSD system it's 1024, and on Windows it's 260.

Performing a test on my Linux system, I noticed that ext3 limits a path component to 255 characters, but it doesn't stop me from nesting as many components as I like. I successfully created a path 6000 characters long. Linux does absolutely nothing to stop me from creating such a large path, nor from mounting one large path on another.
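To make that experiment concrete, here is a minimal sketch of how such a path can be built. It is not the test the author ran: the helper name build_path_longer_than_path_max(), the /tmp location and the 100-byte component name are all assumptions chosen for illustration. The trick is that only short relative names are ever handed to the kernel, so no single call sees anything close to PATH_MAX.

```cpp
#include <climits>
#include <cstdlib>
#include <string>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: nests directories under a fresh temporary directory
// until the combined path is longer than PATH_MAX, then removes them again.
// Returns the path length reached in bytes, or 0 if nothing could be set up.
std::size_t build_path_longer_than_path_max()
{
    int start_fd = open(".", O_RDONLY); // remember where we started
    if (start_fd == -1) { return 0; }

    char base[] = "/tmp/pathmax_demo_XXXXXX"; // assumption: /tmp is writable
    std::size_t total = 0;
    if (mkdtemp(base))
    {
        if (!chdir(base))
        {
            const std::string component(100, 'd'); // far below the usual 255-byte component limit
            const std::size_t limit = static_cast<std::size_t>(PATH_MAX);
            int levels = 0;
            total = std::string(base).size();

            // Descend one level at a time using relative names only.
            while (total <= limit)
            {
                if (mkdir(component.c_str(), 0700)) { break; }
                if (chdir(component.c_str())) { rmdir(component.c_str()); break; }
                total += 1 + component.size(); // one slash plus the component
                ++levels;
            }

            // Clean up: walk back up, removing each level behind us.
            for (; levels > 0; --levels)
            {
                if (chdir("..") || rmdir(component.c_str())) { break; }
            }
        }
        if (fchdir(start_fd) == 0) { rmdir(base); } // return to the start, drop the temp dir
    }
    close(start_fd);
    return total;
}
```

If the returned length is greater than PATH_MAX, the file system accepted a directory whose absolute path cannot fit in a PATH_MAX-sized buffer.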
Running getcwd() in such a large path, even with a huge buffer, fails, since it doesn't work with anything past PATH_MAX. Even a commercial OS like Mac OS X defines it as 1024, but tests show you can create a path several thousand characters long. Interestingly enough, OS X's getcwd() will properly identify a path larger than its PATH_MAX if you pass it a buffer big enough to hold all the data. This is possible because the prototype for getcwd() is:

    char *getcwd(char *buf, size_t size);
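Given that prototype, the only way a caller can benefit from a getcwd() that honours large buffers is to guess a size and retry when the call reports ERANGE. A minimal sketch of that retry loop follows; the helper name getcwd_growing() and the 1 MiB cap are assumptions for illustration. It only helps on systems whose getcwd() actually works past PATH_MAX.

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical helper: calls the built-in getcwd() with ever larger buffers
// until the result fits, the cap is reached, or a different error occurs.
bool getcwd_growing(std::string& out, std::size_t max_size = 1 << 20)
{
    for (std::size_t size = 256; size <= max_size; size *= 2)
    {
        std::vector<char> buf(size);
        if (::getcwd(buf.data(), buf.size()))
        {
            out = buf.data();
            return true;
        }
        if (errno != ERANGE) { return false; } // failed for a reason other than "buffer too small"
    }
    return false; // gave up at the cap
}
```

Note that the loop has no way of knowing the required size in advance; it can only double and hope.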

So a smart getcwd() can work if there's enough room. But unfortunately, there is no way to determine how much space you actually need, so you can't allocate it in advance. You'd have to keep allocating larger and larger buffers hoping one of them will finally work, which is quite absurd.

Since a path can be longer than PATH_MAX, the define is useless, writing code based off of it is wrong, and the functions that require it are broken.

An exception to this is Windows. It doesn't allow any paths to be created larger than 260 characters. If the path was created on a partition from a different OS, Windows won't allow anything to access it. It sounds strange that such a small limit was chosen, considering that FAT has no such limit imposed, and NTFS allows paths to be 32,767 characters long. I can easily imagine someone with a sizable audio collection having a 300+ character path like so:

"C:\Documents and Settings\Jonathan Ezekiel Cornflour\My Documents\My Music\My Personal Rips\2007\Technological\Operating System Symphony Orchestra\The GNOME Musical Men\I Married Her For Her File System\You Don't Appreciate Marriage Until You've Noticed Tax Pro's Wizard For Married Couples.Track 01.MP5"

Before we forget, here's the prototype for realpath():

    char *realpath(const char *file_name, char *resolved_name);

Looking at that prototype, you should immediately ask yourself: where's the size value for resolved_name? We don't want a buffer overflow! Which is why OSs implement it based on the PATH_MAX define: the resolved_name argument must refer to a buffer capable of storing at least PATH_MAX characters. That basically means it can never work on a large path, and no clever OS can implement around it, unless it actually checks how much memory is allocated behind that pointer using an OS-specific method, if one is available.

For these reasons, I've decided to implement getcwd() and realpath() myself. We'll discuss the exact specifics of realpath() next time; for now, we will focus on how one can make their own getcwd().

The idea is to walk up the tree from the working directory until we reach the root, noting along the way which path component we just went across.
Every modern OS has a stat() function which can take a path and return information about it, such as when it was created, which device it is located on, and the like. All these OSs except for Windows return the fields st_dev and st_ino, which together uniquely identify any file or directory. If those two fields match the data retrieved in some other way on the same system, you can be sure they're the same file or directory.

To start, we determine the unique ID for . and for /. Once we have those, we can construct our loop. At each step, while the current ID doesn't equal the root's, we change directory to .., then scan that directory (using opendir()+readdir()+closedir()) for an entry with the same ID as the directory we just left. Once a matching ID is found, we record its name as the correct component for the current level, and move up one.
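The identity check described above can be sketched in isolation. The helper name same_file() is an assumption for illustration; it uses only stat() and the st_dev and st_ino fields mentioned in the text.

```cpp
#include <sys/stat.h>

// Hypothetical helper: true when both paths resolve to the same file or
// directory, judged by the (st_dev, st_ino) pair. Returns false if either
// path cannot be examined.
bool same_file(const char* a, const char* b)
{
    struct stat sa, sb;
    if (stat(a, &sa) || stat(b, &sb)) { return false; }
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For example, same_file("/", "/.") holds because both names refer to the root directory, even though the strings differ.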

Code demonstrating this in C++ is as follows:


    #include <cstring>
    #include <string>
    #include <utility>
    #include <vector>
    #include <dirent.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    bool getcwd(std::string& path)
    {
      typedef std::pair<dev_t, ino_t> file_id;

      bool success = false;
      int start_fd = open(".", O_RDONLY); //Keep track of start directory, so can jump back to it later
      if (start_fd != -1)
      {
        struct stat sb;
        if (!fstat(start_fd, &sb))
        {
          file_id current_id(sb.st_dev, sb.st_ino);
          if (!stat("/", &sb)) //Get info for root directory, so we can determine when we hit it
          {
            std::vector<std::string> path_components;
            file_id root_id(sb.st_dev, sb.st_ino);

            while (current_id != root_id) //If they're equal, we've obtained enough info to build the path
            {
              bool pushed = false;

              if (!chdir("..")) //Keep recursing towards root each iteration
              {
                DIR *dir = opendir(".");
                if (dir)
                {
                  dirent *entry;
                  while ((entry = readdir(dir))) //We loop through each entry trying to find where we came from
                  {
                    if ((strcmp(entry->d_name, ".") && strcmp(entry->d_name, "..") && !lstat(entry->d_name, &sb)))
                    {
                      file_id child_id(sb.st_dev, sb.st_ino);
                      if (child_id == current_id) //We found where we came from, add its name to the list
                      {
                        path_components.push_back(entry->d_name);
                        pushed = true;
                        break;
                      }
                    }
                  }
                  closedir(dir);

                  if (pushed && !stat(".", &sb)) //If we have a reason to continue, we update the current dir id
                  {
                    current_id = file_id(sb.st_dev, sb.st_ino);
                  }
                }//Else, Uh oh, can't read information at this level
              }
              if (!pushed) { break; } //If we didn't obtain any info this pass, no reason to continue
            }

            if (current_id == root_id) //Unless they're equal, we failed above
            {
              //Built the path, will always end with a slash
              path = "/";
              for (std::vector<std::string>::reverse_iterator i = path_components.rbegin(); i != path_components.rend(); ++i)
              {
                path += *i+"/";
              }
              success = true;
            }
            fchdir(start_fd);
          }
        }
        close(start_fd);
      }

      return(success);
    }
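The path-assembly loop at the end of the function always leaves a trailing slash. For comparison, a variant that keeps the slash only when the root directory is the entire path might look like the sketch below. The helper name join_without_trailing_slash() is an assumption for illustration; it takes the components in the same leaf-first order in which the walk collects them.

```cpp
#include <string>
#include <vector>

// Hypothetical helper. `components` is ordered leaf-first (deepest directory
// at index 0), matching the order produced by walking from the working
// directory up to the root.
std::string join_without_trailing_slash(const std::vector<std::string>& components)
{
    if (components.empty()) { return "/"; } // the working directory is the root itself

    std::string path;
    for (std::vector<std::string>::const_reverse_iterator i = components.rbegin(); i != components.rend(); ++i)
    {
        path += "/"; // separator goes in front of each component, never after the last
        path += *i;
    }
    return path;
}
```

With components {"c", "b", "a"} this yields /a/b/c, and with no components it yields /.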

Before we accept that as the de facto method to use in your application, let us discuss the flaws.

As mentioned above, it doesn't work on Windows, but a simple #ifdef for Windows can just make it a wrapper around the built-in getcwd() with a local buffer of size PATH_MAX, which is fine for Windows, and pretty much no other OS.

This function uses the name getcwd(), which can conflict with the built-in C one, and that is a problem for certain compilers. The fix is to rename it, or put it in its own namespace.

Next, the built-in getcwd() implementations I checked only have a trailing slash on the root directory. I personally like having the slash appended, since I'm usually concatenating a filename onto it. But note that if you're not using the result for concatenation, and instead pass it to functions like access(), stat(), opendir(), chdir(), and the like, an OS may not like being called with a trailing slash. I've only noticed that being an issue with DJGPP and a few functions. So if it matters to you, the loop near the end of the function can easily be modified to not add the trailing slash, except in the case that the root directory is the entire path.

This function also changes the directory in the process, so it's not thread safe. But then again, many built-in implementations aren't thread safe either. If you use threads, calculate all the paths you need prior to creating the threads. That is probably a good idea anyway: keep using path names based off of your absolute directories, instead of changing directories elsewhere during the main execution of the program. Otherwise, you'll have to use a mutex around the call, which is also a valid option.
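One way to arrange the mutex option is a small RAII guard that holds a process-wide lock while the working directory is being changed and restores the starting directory on scope exit. This is only a sketch under stated assumptions: the class name CwdGuard is hypothetical, it requires C++11 for std::mutex, and it protects only code that also goes through the guard.

```cpp
#include <mutex>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical RAII helper: serialises code that changes the working
// directory and jumps back to the original directory when it goes out of scope.
class CwdGuard
{
public:
    CwdGuard() : lock_(cwd_mutex()), start_fd_(open(".", O_RDONLY)) {}

    ~CwdGuard()
    {
        if (start_fd_ != -1)
        {
            if (fchdir(start_fd_)) { /* could not restore; nothing safe to do in a destructor */ }
            close(start_fd_);
        }
        // lock_ is released after this body runs, so the restore happens under the lock
    }

    bool ok() const { return start_fd_ != -1; } // false if the start directory could not be opened

    CwdGuard(const CwdGuard&) = delete;
    CwdGuard& operator=(const CwdGuard&) = delete;

private:
    static std::mutex& cwd_mutex()
    {
        static std::mutex m; // one lock for the whole process, since the cwd is process-wide
        return m;
    }

    std::lock_guard<std::mutex> lock_; // declared first, so it is acquired first and released last
    int start_fd_;
};
```

A caller would create a CwdGuard at the top of the scope that walks the directory tree, check ok(), and then chdir() freely; other threads using the guard wait until the scope ends.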

There could also be the issue that some level of the path isn't readable. That can happen on UNIX, where entering a directory only requires execute permission, not read permission. I'm not sure what one can do in that case, except maybe fall back on the built-in one and hope it does some magical kernel call to get around it. If anyone has any advice on this one, please post about it in the comments.

Lastly, this function is written in C++, which is annoying for C users. The std::vector can be replaced with a linked list keeping track of the components; at the end, allocate a buffer of the size needed and return it. This requires the caller to free the buffer, but there really isn't any other safe way of doing this. Alternatively, instead of a linked list, a buffer which is reallocated as needed can be used while building the path, memmove()'ing the components built so far over to the higher part of the buffer each time.

During the course of the rest of the program, all path manipulation should use safe, allocation-managing strings such as std::string, or should be based off of the auto-allocating getcwd() described above and similar functions, handling the memory management and growing as needed. Be careful when you need to get any path information from elsewhere, as you can never be sure how large it will be.

I hope developers realize that when not on Windows, using the incorrect define PATH_MAX is just wrong, and fix their applications. Next time, we'll discuss how one can implement their own realpath().

Comments:

Dan said...

Hey, good to see you're finally back from your extended vacation. Bet the kids loved it, I know I always loved Disney World as a kid (hope they didn't make you go on "It's a Small World" too many times, that gets annoying after a while). Anyway, what you said about Windows reminds me of a trick I used in high school to hide files on the school's network.
I'd basically create a path as long as it would allow, put my games or whatever other forbidden files in there, and move the entire path once more into a new directory. Then when they tried to see what was in there, all they'd get is a recurrence of "New Folder/New Folder/New Folder/" etc. until they couldn't open it. It also wouldn't delete IIRC, and they weren't smart enough to realize to move it one level up, which is, of course, the method by which I would access my files there.

As for your wife's filesystem, what's she running? ReiserFS V5? Ext7? FAT4096? It's gotta be something cool if you're ripping mp5s.

The problem with PATH_MAX though, is that it's like a great sports play. You can't just rush into the score zone, you'd get a buffer overflow! But rather than implementing it sanely in a manner that you pass a buffer and a size, they decided to make their own number that has no relation whatsoever to the actual max path length. Sure, you could say something like "no one could possibly need a path longer than X", but we saw how well that worked when Gates Almighty stated that "640K ought to be enough for anybody". You just never know how much of anything will be enough for someone, and therein lies the flaw in many aspects of computing today.

In closing, stupid people suck.

November 6, 2007 8:40 AM

L3thal said...

great topic, great explanation :)

December 2, 2009 7:44 AM

fantastico said...

Windows allows approx 32k Unicode chars for the whole concatenated path, so long as:

1. You call the Unicode ('W') APIs rather than the OEM ('A') ones, e.g. CreateFileW; AND
2. You prepend the magic string '\\?\' to your path; AND
3. Your path is absolute or UNC, rather than relative.

I know this is weird and sounds unlikely, but I've personally written a test program to exercise this bizarre feature. MS documentation here: http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx#maxpath

For extra kicks, you can also do: \\?\UNC\myhostname\mysharename\my\big\long\sequence\of\subdirs

January 18, 2010 7:51 PM

fantastico said...

Dan: the problem with PATH_MAX is not stupidity of the designers. It's the history of UNIX. Please don't accuse the designers of stupidity without doing some research first.

Modern *NIXes support multiple filesystem types simultaneously, each one with its own values for its own limits of various kinds. Worse, most modern *NIXes support something like loadable kernel filesystem modules. So now entire types of filesystem can come and go at the whim of the sysadmin.

Given this, the idea of a single, static max-length number is wrong, regardless of its value. But that does not mean that those who came up with the idea were stupid.

Back in the day, UNIX had none of these features. In those days, the PATH_MAX concept was:

a) sufficient for the purposes of the time;
b) simple for people to code to (compare modern sysconf);
c) able to be implemented efficiently on a PDP-11.

These were the times when malloc() performance sucked, so it was to be avoided at almost any cost, including static buffers with no bounds checking.

In closing, judgemental ignorant people suck, and they make themselves look silly when they spout off on other peoples' blogs.

January 18, 2010 8:05 PM
