Correct usage of os.walk and os.path.join

I’m trying to write a handler that acts on files within various subdirectories, but while my script can see these files, it cannot do anything with them because it fails to assemble their paths.

The problematic part comes from this loop:

import os

for (path, dirs, files) in os.walk("data/"):
    for image in files:
        # do something to the image
        pass

Now, the script works in the first level of the data directory, but fails to work on data’s subdirectories.

I tried using os.path.join():

for (path, dirs, files) in os.walk("data/"):
    print os.path.join(path, dirs)

But that throws the following:

Traceback (most recent call last):
  File "bench.py", line 26, in <module>
    print os.path.join(path, dirs)
  File "/usr/lib/python2.7/posixpath.py", line 75, in join
    if b.startswith('/'):
AttributeError: 'list' object has no attribute 'startswith'

In short, what I’d like to do is assemble a path from data to the image which includes data’s subdirectories. What’s the best practice for doing this?

Best answer

I think you want to join path with each file name in files:

import os

for path, dirs, files in os.walk('data/'):
    for f in files:
        # path is the directory currently being visited; f is a bare file name
        fname = os.path.join(path, f)
        assert os.path.exists(fname)

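dirs is a list of the names of the subdirectories in the directory path, not a single string, which is why os.path.join(path, dirs) raises the AttributeError. You can actually modify dirs in place to prevent os.walk from walking into certain directories (neat!). This works because os.walk walks top-down by default.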
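
To illustrate the in-place pruning, here is a minimal sketch. The helper name list_files, its skip_dirs parameter, and the cats/thumbs folder names are made up for the example and are not from the question; the script builds a throwaway directory tree so it can run anywhere, and it should behave the same on Python 2.7 and Python 3.

```python
import os
import tempfile


def list_files(root, skip_dirs=()):
    """Return full paths of all files under root, skipping directories named in skip_dirs.

    list_files and skip_dirs are hypothetical names used only for this sketch.
    """
    found = []
    for path, dirs, files in os.walk(root):
        # Slice assignment mutates the list that os.walk holds on to.
        # Rebinding the name (dirs = [...]) would NOT prune anything.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for f in files:
            found.append(os.path.join(path, f))
    return sorted(found)


# Build a small throwaway tree: one file at the top level and one in each subfolder.
root = tempfile.mkdtemp()
for sub in ("", "cats", "thumbs"):
    folder = os.path.join(root, sub)
    if not os.path.isdir(folder):
        os.makedirs(folder)
    open(os.path.join(folder, "img.png"), "w").close()

paths = list_files(root, skip_dirs=("thumbs",))
print(paths)
```

The files in the top level and in cats are returned with their full paths, while thumbs is never visited. The key detail is dirs[:] = ...: os.walk consults that same list object to decide where to descend next, so only an in-place change has any effect.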