How can I list all files of a directory in Python and add them to a list
?
os.listdir()
will get you everything that's in a directory - files and directories.
If you want just files, you could either filter this down using os.path
:
from os import listdir
from os.path import isfile, join
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
or you could use os.walk()
which will yield two lists for each directory it visits - splitting into files and dirs for you. If you only want the top directory you can break the first time it yields
from os import walk
f = []
for (dirpath, dirnames, filenames) in walk(mypath):
f.extend(filenames)
break
or, shorter:
from os import walk
filenames = next(walk(mypath), (None, None, []))[2] # [] if no file
pycruft
I prefer using the glob
module, as it does pattern matching and expansion.
import glob
print(glob.glob("/home/adam/*"))
It does pattern matching intuitively
import glob
# All files ending with .txt
print(glob.glob("/home/adam/*.txt"))
# All files ending with .txt with depth of 2 folder
print(glob.glob("/home/adam/*/*.txt"))
It will return a list with the queried files:
['/home/adam/file1.txt', '/home/adam/file2.txt', .... ]
adamk
os.listdir()
- list in the current directory
With listdir in os module you get the files and the folders in the current dir
import os
arr = os.listdir()
print(arr)
>>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']
Looking in a directory
arr = os.listdir('c:\\files')
glob
from glob
with glob you can specify a type of file to list like this
import glob
txtfiles = []
for file in glob.glob("*.txt"):
txtfiles.append(file)
glob
in a list comprehension
mylist = [f for f in glob.glob("*.txt")]
get the full path of only files in the current directory
import os
from os import listdir
from os.path import isfile, join
cwd = os.getcwd()
onlyfiles = [os.path.join(cwd, f) for f in os.listdir(cwd) if
os.path.isfile(os.path.join(cwd, f))]
print(onlyfiles)
['G:\\getfilesname\\getfilesname.py', 'G:\\getfilesname\\example.txt']
Getting the full path name with
os.path.abspath
You get the full path in return
import os
files_path = [os.path.abspath(x) for x in os.listdir()]
print(files_path)
['F:\\documenti\applications.txt', 'F:\\documenti\collections.txt']
Walk: going through sub directories
os.walk returns the root, the directories list and the files list, that is why I unpacked them in r, d, f in the for loop; it, then, looks for other files and directories in the subfolders of the root and so on until there are no subfolders.
import os
# Getting the current work directory (cwd)
thisdir = os.getcwd()
# r=root, d=directories, f = files
for r, d, f in os.walk(thisdir):
for file in f:
if file.endswith(".docx"):
print(os.path.join(r, file))
os.listdir()
: get files in the current directory (Python 2)
In Python 2, if you want the list of the files in the current directory, you have to give the argument as '.' or os.getcwd() in the os.listdir method.
import os
arr = os.listdir('.')
print(arr)
>>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']
To go up in the directory tree
# Method 1
x = os.listdir('..')
# Method 2
x= os.listdir('/')
Get files:
os.listdir()
in a particular directory (Python 2 and 3)
import os
arr = os.listdir('F:\\python')
print(arr)
>>> ['$RECYCLE.BIN', 'work.txt', '3ebooks.txt', 'documents']
Get files of a particular subdirectory with
os.listdir()
import os
x = os.listdir("./content")
os.walk('.')
- current directory
import os
arr = next(os.walk('.'))[2]
print(arr)
>>> ['5bs_Turismo1.pdf', '5bs_Turismo1.pptx', 'esperienza.txt']
next(os.walk('.'))
andos.path.join('dir', 'file')
import os
arr = []
for d,r,f in next(os.walk("F:\\_python")):
for file in f:
arr.append(os.path.join(r,file))
for f in arr:
print(files)
>>> F:\\_python\\dict_class.py
>>> F:\\_python\\programmi.txt
next(os.walk('F:\\')
- get the full path - list comprehension
[os.path.join(r,file) for r,d,f in next(os.walk("F:\\_python")) for file in f]
>>> ['F:\\_python\\dict_class.py', 'F:\\_python\\programmi.txt']
os.walk
- get full path - all files in sub dirs**
x = [os.path.join(r,file) for r,d,f in os.walk("F:\\_python") for file in f]
print(x)
>>> ['F:\\_python\\dict.py', 'F:\\_python\\progr.txt', 'F:\\_python\\readl.py']
os.listdir()
- get only txt files
arr_txt = [x for x in os.listdir() if x.endswith(".txt")]
print(arr_txt)
>>> ['work.txt', '3ebooks.txt']
Using
glob
to get the full path of the files
If I should need the absolute path of the files:
from path import path
from glob import glob
x = [path(f).abspath() for f in glob("F:\\*.txt")]
for f in x:
print(f)
>>> F:\acquistionline.txt
>>> F:\acquisti_2018.txt
>>> F:\bootstrap_jquery_ecc.txt
Using
os.path.isfile
to avoid directories in the list
import os.path
listOfFiles = [f for f in os.listdir() if os.path.isfile(f)]
print(listOfFiles)
>>> ['a simple game.py', 'data.txt', 'decorator.py']
Using
pathlib
from Python 3.4
import pathlib
flist = []
for p in pathlib.Path('.').iterdir():
if p.is_file():
print(p)
flist.append(p)
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speak_gui2.py
>>> thumb.PNG
With list comprehension
:
flist = [p for p in pathlib.Path('.').iterdir() if p.is_file()]
Alternatively, use pathlib.Path()
instead of pathlib.Path(".")
Use glob method in pathlib.Path()
import pathlib
py = pathlib.Path().glob("*.py")
for file in py:
print(file)
>>> stack_overflow_list.py
>>> stack_overflow_list_tkinter.py
Get all and only files with os.walk
import os
x = [i[2] for i in os.walk('.')]
y=[]
for t in x:
for f in t:
y.append(f)
print(y)
>>> ['append_to_list.py', 'data.txt', 'data1.txt', 'data2.txt', 'data_180617', 'os_walk.py', 'READ2.py', 'read_data.py', 'somma_defaltdic.py', 'substitute_words.py', 'sum_data.py', 'data.txt', 'data1.txt', 'data_180617']
Get only files with next and walk in a directory
import os
x = next(os.walk('F://python'))[2]
print(x)
>>> ['calculator.bat','calculator.py']
Get only directories with next and walk in a directory
import os
next(os.walk('F://python'))[1] # for the current dir use ('.')
>>> ['python3','others']
Get all the subdir names with
walk
for r,d,f in os.walk("F:\\_python"):
for dirs in d:
print(dirs)
>>> .vscode
>>> pyexcel
>>> pyschool.py
>>> subtitles
>>> _metaprogramming
>>> .ipynb_checkpoints
os.scandir()
from Python 3.5 and greater
import os
x = [f.name for f in os.scandir() if f.is_file()]
print(x)
>>> ['calculator.bat','calculator.py']
# Another example with scandir (a little variation from docs.python.org)
# This one is more efficient than os.listdir.
# In this case, it shows the files only in the current directory
# where the script is executed.
import os
with os.scandir() as i:
for entry in i:
if entry.is_file():
print(entry.name)
>>> ebookmaker.py
>>> error.PNG
>>> exemaker.bat
>>> guiprova.mp3
>>> setup.py
>>> speakgui4.py
>>> speak_gui2.py
>>> speak_gui3.py
>>> thumb.PNG
Examples:
Ex. 1: How many files are there in the subdirectories?
In this example, we look for the number of files that are included in all the directory and its subdirectories.
import os
def count(dir, counter=0):
"returns number of files in dir and subdirs"
for pack in os.walk(dir):
for f in pack[2]:
counter += 1
return dir + " : " + str(counter) + "files"
print(count("F:\\python"))
>>> 'F:\\\python' : 12057 files'
Ex.2: How to copy all files from a directory to another?
A script to make order in your computer finding all files of a type (default: pptx) and copying them in a new folder.
import os
import shutil
from path import path
destination = "F:\\file_copied"
# os.makedirs(destination)
def copyfile(dir, filetype='pptx', counter=0):
"Searches for pptx (or other - pptx is the default) files and copies them"
for pack in os.walk(dir):
for f in pack[2]:
if f.endswith(filetype):
fullpath = pack[0] + "\\" + f
print(fullpath)
shutil.copy(fullpath, destination)
counter += 1
if counter > 0:
print('-' * 30)
print("\t==> Found in: `" + dir + "` : " + str(counter) + " files\n")
for dir in os.listdir():
"searches for folders that starts with `_`"
if dir[0] == '_':
# copyfile(dir, filetype='pdf')
copyfile(dir, filetype='txt')
>>> _compiti18\Compito Contabilità 1\conti.txt
>>> _compiti18\Compito Contabilità 1\modula4.txt
>>> _compiti18\Compito Contabilità 1\moduloa4.txt
>>> ------------------------
>>> ==> Found in: `_compiti18` : 3 files
Ex. 3: How to get all the files in a txt file
In case you want to create a txt file with all the file names:
import os
mylist = ""
with open("filelist.txt", "w", encoding="utf-8") as file:
for eachfile in os.listdir():
mylist += eachfile + "\n"
file.write(mylist)
Example: txt with all the files of an hard drive
"""
We are going to save a txt file with all the files in your directory.
We will use the function walk()
"""
import os
# see all the methods of os
# print(*dir(os), sep=", ")
listafile = []
percorso = []
with open("lista_file.txt", "w", encoding='utf-8') as testo:
for root, dirs, files in os.walk("D:\\"):
for file in files:
listafile.append(file)
percorso.append(root + "\\" + file)
testo.write(file + "\n")
listafile.sort()
print("N. of files", len(listafile))
with open("lista_file_ordinata.txt", "w", encoding="utf-8") as testo_ordinato:
for file in listafile:
testo_ordinato.write(file + "\n")
with open("percorso.txt", "w", encoding="utf-8") as file_percorso:
for file in percorso:
file_percorso.write(file + "\n")
os.system("lista_file.txt")
os.system("lista_file_ordinata.txt")
os.system("percorso.txt")
All the file of C:\ in one text file
This is a shorter version of the previous code. Change the folder where to start finding the files if you need to start from another position. This code generate a 50 mb on text file on my computer with something less then 500.000 lines with files with the complete path.
import os
with open("file.txt", "w", encoding="utf-8") as filewrite:
for r, d, f in os.walk("C:\\"):
for file in f:
filewrite.write(f"{r + file}\n")
How to write a file with all paths in a folder of a type
With this function you can create a txt file that will have the name of a type of file that you look for (ex. pngfile.txt) with all the full path of all the files of that type. It can be useful sometimes, I think.
import os
def searchfiles(extension='.ttf', folder='H:\\'):
"Create a txt file with all the file of a type"
with open(extension[1:] + "file.txt", "w", encoding="utf-8") as filewrite:
for r, d, f in os.walk(folder):
for file in f:
if file.endswith(extension):
filewrite.write(f"{r + file}\n")
# looking for png file (fonts) in the hard disk H:\
searchfiles('.png', 'H:\\')
>>> H:\4bs_18\Dolphins5.png
>>> H:\4bs_18\Dolphins6.png
>>> H:\4bs_18\Dolphins7.png
>>> H:\5_18\marketing html\assets\imageslogo2.png
>>> H:\7z001.png
>>> H:\7z002.png
(New) Find all files and open them with tkinter GUI
I just wanted to add in this 2019 a little app to search for all files in a dir and be able to open them by doubleclicking on the name of the file in the list.
import tkinter as tk
import os
def searchfiles(extension='.txt', folder='H:\\'):
"insert all files in the listbox"
for r, d, f in os.walk(folder):
for file in f:
if file.endswith(extension):
lb.insert(0, r + "\\" + file)
def open_file():
os.startfile(lb.get(lb.curselection()[0]))
root = tk.Tk()
root.geometry("400x400")
bt = tk.Button(root, text="Search", command=lambda:searchfiles('.png', 'H:\\'))
bt.pack()
lb = tk.Listbox(root)
lb.pack(fill="both", expand=1)
lb.bind("<Double-Button>", lambda x: open_file())
root.mainloop()
PythonProgrammi
import os
os.listdir("somedirectory")
will return a list of all files and directories in "somedirectory".
sepp2k
A one-line solution to get only list of files (no subdirectories):
filenames = next(os.walk(path))[2]
or absolute pathnames:
paths = [os.path.join(path, fn) for fn in next(os.walk(path))[2]]
Remi
Getting Full File Paths From a Directory and All Its Subdirectories
import os
def get_filepaths(directory):
"""
This function will generate the file names in a directory
tree by walking the tree either top-down or bottom-up. For each
directory in the tree rooted at directory top (including top itself),
it yields a 3-tuple (dirpath, dirnames, filenames).
"""
file_paths = [] # List which will store all of the full filepaths.
# Walk the tree.
for root, directories, files in os.walk(directory):
for filename in files:
# Join the two strings in order to form the full filepath.
filepath = os.path.join(root, filename)
file_paths.append(filepath) # Add it to the list.
return file_paths # Self-explanatory.
# Run the above function and store its results in a variable.
full_file_paths = get_filepaths("/Users/johnny/Desktop/TEST")
- The path I provided in the above function contained 3 files— two of them in the root directory, and another in a subfolder called "SUBFOLDER." You can now do things like:
print full_file_paths
which will print the list:['/Users/johnny/Desktop/TEST/file1.txt', '/Users/johnny/Desktop/TEST/file2.txt', '/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat']
If you'd like, you can open and read the contents, or focus only on files with the extension ".dat" like in the code below:
for f in full_file_paths:
if f.endswith(".dat"):
print f
/Users/johnny/Desktop/TEST/SUBFOLDER/file3.dat
Johnny
Since version 3.4 there are builtin iterators for this which are a lot more efficient than os.listdir()
:
pathlib
: New in version 3.4.
>>> import pathlib
>>> [p for p in pathlib.Path('.').iterdir() if p.is_file()]
According to PEP 428, the aim of the pathlib
library is to provide a simple hierarchy of classes to handle filesystem paths and the common operations users do over them.
os.scandir()
: New in version 3.5.
>>> import os
>>> [entry for entry in os.scandir('.') if entry.is_file()]
Note that os.walk()
uses os.scandir()
instead of os.listdir()
from version 3.5, and its speed got increased by 2-20 times according to PEP 471.
Let me also recommend reading ShadowRanger's comment below.
SzieberthAdam
Preliminary notes
- Although there's a clear differentiation between file and directory terms in the question text, some may argue that directories are actually special files
- The statement: "all files of a directory" can be interpreted in two ways:
- All direct (or level 1) descendants only
- All descendants in the whole directory tree (including the ones in sub-directories)
When the question was asked, I imagine that Python 2, was the LTS version, however the code samples will be run by Python 3(.5) (I'll keep them as Python 2 compliant as possible; also, any code belonging to Python that I'm going to post, is from v3.5.4 - unless otherwise specified). That has consequences related to another keyword in the question: "add them into a list
Community Wiki
I really liked adamk's answer, suggesting that you use glob()
, from the module of the same name. This allows you to have pattern matching with *
s.
But as other people pointed out in the comments, glob()
can get tripped up over inconsistent slash directions. To help with that, I suggest you use the join()
and expanduser()
functions in the os.path
module, and perhaps the getcwd()
function in the os
module, as well.
As examples:
from glob import glob
# Return everything under C:\Users\admin that contains a folder called wlp.
glob('C:\Users\admin\*\wlp')
The above is terrible - the path has been hardcoded and will only ever work on Windows between the drive name and the \
s being hardcoded into the path.
from glob import glob
from os.path import join
# Return everything under Users, admin, that contains a folder called wlp.
glob(join('Users', 'admin', '*', 'wlp'))
The above works better, but it relies on the folder name Users
which is often found on Windows and not so often found on other OSs. It also relies on the user having a specific name, admin
.
from glob import glob
from os.path import expanduser, join
# Return everything under the user directory that contains a folder called wlp.
glob(join(expanduser('~'), '*', 'wlp'))
This works perfectly across all platforms.
Another great example that works perfectly across platforms and does something a bit different:
from glob import glob
from os import getcwd
from os.path import join
# Return everything under the current directory that contains a folder called wlp.
glob(join(getcwd(), '*', 'wlp'))
Hope these examples help you see the power of a few of the functions you can find in the standard Python library modules.
ArtOfWarfare
def list_files(path):
# returns a list of names (with extension, without full path) of all files
# in folder path
files = []
for name in os.listdir(path):
if os.path.isfile(os.path.join(path, name)):
files.append(name)
return files
Apogentus
If you are looking for a Python implementation of find, this is a recipe I use rather frequently:
from findtools.find_files import (find_files, Match)
# Recursively find all *.sh files in **/usr/bin**
sh_files_pattern = Match(filetype='f', name='*.sh')
found_files = find_files(path='/usr/bin', match=sh_files_pattern)
for found_file in found_files:
print found_file
So I made a PyPI package out of it and there is also a GitHub repository. I hope that someone finds it potentially useful for this code.
Yauhen Yakimovich
For greater results, you can use listdir()
method of the os
module along with a generator (a generator is a powerful iterator that keeps its state, remember?). The following code works fine with both versions: Python 2 and Python 3.
Here's a code:
import os
def files(path):
for file in os.listdir(path):
if os.path.isfile(os.path.join(path, file)):
yield file
for file in files("."):
print (file)
The listdir()
method returns the list of entries for the given directory. The method os.path.isfile()
returns True
if the given entry is a file. And the yield
operator quits the func but keeps its current state, and it returns only the name of the entry detected as a file. All the above allows us to loop over the generator function.
Andy Fedoroff
Returning a list of absolute filepaths, does not recurse into subdirectories
L = [os.path.join(os.getcwd(),f) for f in os.listdir('.') if os.path.isfile(os.path.join(os.getcwd(),f))]
The2ndSon
A wise teacher told me once that:
When there are several established ways to do something, none of them is good for all cases.
I will thus add a solution for a subset of the problem: quite often, we only want to check whether a file matches a start string and an end string, without going into subdirectories. We would thus like a function that returns a list of filenames, like:
filenames = dir_filter('foo/baz', radical='radical', extension='.txt')
If you care to first declare two functions, this can be done:
def file_filter(filename, radical='', extension=''):
"Check if a filename matches a radical and extension"
if not filename:
return False
filename = filename.strip()
return(filename.startswith(radical) and filename.endswith(extension))
def dir_filter(dirname='', radical='', extension=''):
"Filter filenames in directory according to radical and extension"
if not dirname:
dirname = '.'
return [filename for filename in os.listdir(dirname)
if file_filter(filename, radical, extension)]
This solution could be easily generalized with regular expressions (and you might want to add a pattern
argument, if you do not want your patterns to always stick to the start or end of the filename).
fralau
import os
import os.path
def get_files(target_dir):
item_list = os.listdir(target_dir)
file_list = list()
for item in item_list:
item_dir = os.path.join(target_dir,item)
if os.path.isdir(item_dir):
file_list += get_files(item_dir)
else:
file_list.append(item_dir)
return file_list
Here I use a recursive structure.
pah8J
Using generators
import os
def get_files(search_path):
for (dirpath, _, filenames) in os.walk(search_path):
for filename in filenames:
yield os.path.join(dirpath, filename)
list_files = get_files('.')
for filename in list_files:
print(filename)
shantanoo
Another very readable variant for Python 3.4+ is using pathlib.Path.glob:
from pathlib import Path
folder = '/foo'
[f for f in Path(folder).glob('*') if f.is_file()]
It is simple to make more specific, e.g. only look for Python source files which are not symbolic links, also in all subdirectories:
[f for f in Path(folder).glob('**/*.py') if not f.is_symlink()]
fhchl
For Python 2:
pip install rglob
Then do
import rglob
file_list = rglob.rglob("/home/base/dir/", "*")
print file_list
chris-piekarski
I will provide a sample one liner where sourcepath and file type can be provided as input. The code returns a list of filenames with csv extension. Use . in case all files needs to be returned. This will also recursively scans the subdirectories.
[y for x in os.walk(sourcePath) for y in glob(os.path.join(x[0], '*.csv'))]
Modify file extensions and source path as needed.
Vinodh Krishnaraju
dircache is "Deprecated since version 2.6: The dircache module has been removed in Python 3.0."
import dircache
list = dircache.listdir(pathname)
i = 0
check = len(list[0])
temp = []
count = len(list)
while count != 0:
if len(list[i]) != check:
temp.append(list[i-1])
check = len(list[i])
else:
i = i + 1
count = count - 1
print temp
shaji
Retrieved from : http:www.stackoverflow.com/questions/3207219/how-do-i-list-all-files-of-a-directory
'etc. > StackOverFlow in English' 카테고리의 다른 글
Delete commits from a branch in Git (0) | 2023.06.06 |
---|---|
Stash only one file out of multiple files that have changed with Git? (0) | 2023.06.06 |
Finding the index of an item in a list (0) | 2023.06.06 |
When to use LinkedList over ArrayList in Java? (0) | 2023.06.06 |
How do I tell if a regular file does not exist in Bash? (0) | 2023.06.06 |