I have a log file with a size of 2.5 GB. Is there any way to split this file into smaller files using the Windows command prompt?
asked Aug 3, 2015 at 11:40
If you have installed Git for Windows, you should have Git Bash installed, since that comes with Git.
Use the split command in Git Bash to split a file:
- into files of size 500MB each:
  split myLargeFile.txt -b 500m
- into files with 10000 lines each:
  split myLargeFile.txt -l 10000
Tips:
- If you don't have Git/Git Bash, download it at https://git-scm.com/download
- If you lost the shortcut to Git Bash, you can run it using C:\Program Files\Git\git-bash.exe
That’s it!
I always like examples though…
Example:
The files generated by split are named xaa, xab, xac, etc.
These names are made up of a prefix and a suffix, which you can specify. Since I didn't specify what I want the prefix or suffix to look like, the prefix defaulted to x, and the suffix defaulted to a two-character alphabetical enumeration.
Another Example:
This example demonstrates
- using a filename prefix of MySlice (instead of the default x),
- the -d flag for using numerical suffixes (instead of aa, ab, ac, etc.),
- and the option -a 5 to tell it I want the suffixes to be 5 digits long:
split myLargeFile.txt MySlice -d -a 5
The output files are then named MySlice00000, MySlice00001, MySlice00002, etc.
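If you'd rather not install Git Bash, the same idea can be sketched in Python. This is only an illustration of what the prefix, -d and -a options do, not how split itself is implemented; the function name split_by_lines and its parameters are made up for this example.

```python
from itertools import islice
from pathlib import Path


def split_by_lines(src, lines_per_file=10000, prefix="MySlice", suffix_len=5, out_dir="."):
    """Write consecutive chunks of `lines_per_file` lines to prefix00000, prefix00001, ..."""
    out_dir = Path(out_dir)
    written = []
    # Binary mode keeps line endings and the original encoding byte-for-byte.
    with open(src, "rb") as infile:
        index = 0
        while True:
            # Only one chunk of lines is held in memory at a time.
            chunk = list(islice(infile, lines_per_file))
            if not chunk:
                break
            # Zero-padded numeric suffix, like `-d -a 5`.
            part = out_dir / f"{prefix}{index:0{suffix_len}d}"
            with open(part, "wb") as outfile:
                outfile.writelines(chunk)
            written.append(part)
            index += 1
    return written
```

Calling split_by_lines("myLargeFile.txt") would write MySlice00000, MySlice00001, and so on into the current folder, 10000 lines each.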
answered Aug 24, 2018 at 15:40
Josh Withee
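The batch script below splits a file into pieces of LPF lines each (15000 in this example):
@echo off
setlocal ENABLEDELAYEDEXPANSION
REM Edit this value to change the name of the file that needs splitting. Include the extension.
SET BFN=upload.txt
REM Edit this value to change the number of lines per file.
SET LPF=15000
REM Edit this value to change the name of each short file. It will be followed by a number indicating where it is in the list.
SET SFN=SplitFile
REM Do not change beyond this line.
REM SFX is the last three characters of the file name, i.e. the extension.
SET SFX=%BFN:~-3%
SET /A LineNum=0
SET /A FileNum=1
REM An empty delims= keeps each line whole. FOR /F still skips blank lines,
REM and delayed expansion strips exclamation marks from the text.
REM Putting the redirection first avoids writing a trailing space on every line.
For /F "usebackq delims=" %%l in ("%BFN%") Do (
    SET /A LineNum+=1
    >>"%SFN%!FileNum!.%SFX%" echo(%%l
    if !LineNum! EQU !LPF! (
        SET /A LineNum=0
        SET /A FileNum+=1
    )
)
endlocal
Pause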
See below:
https://forums.techguy.org/threads/solved-split-a-100000-line-csv-into-5000-line-csv-files-with-dos-batch.1023949/
answered Oct 16, 2020 at 16:35
Bhanu Sinha
Set Arg = WScript.Arguments
set WshShell = createObject("Wscript.Shell")
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
Set rs = CreateObject("ADODB.Recordset")
With rs
    .Fields.Append "LineNumber", 4
    .Fields.Append "Txt", 201, 5000
    .Open
    LineCount = 0
    Do Until Inp.AtEndOfStream
        LineCount = LineCount + 1
        .AddNew
        .Fields("LineNumber").value = LineCount
        .Fields("Txt").value = Inp.readline
        .UpDate
    Loop
    .Sort = "LineNumber ASC"
    If LCase(Arg(1)) = "t" then
        If LCase(Arg(2)) = "i" then
            .filter = "LineNumber < " & LCase(Arg(3)) + 1
        ElseIf LCase(Arg(2)) = "x" then
            .filter = "LineNumber > " & LCase(Arg(3))
        End If
    ElseIf LCase(Arg(1)) = "b" then
        If LCase(Arg(2)) = "i" then
            .filter = "LineNumber > " & LineCount - LCase(Arg(3))
        ElseIf LCase(Arg(2)) = "x" then
            .filter = "LineNumber < " & LineCount - LCase(Arg(3)) + 1
        End If
    End If
    Do While not .EOF
        Outp.writeline .Fields("Txt").Value
        .MoveNext
    Loop
End With
Cut
filter cut {t|b} {i|x} NumOfLines
Cuts the number of lines from the top or bottom of file.
t - top of the file
b - bottom of the file
i - include n lines
x - exclude n lines
Example
cscript /nologo filter.vbs cut t i 5 < "%systemroot%\win.ini"
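Another way: this outputs lines 5001+, adapt for your use. It uses almost no memory because it reads and writes one line at a time.
Do Until Inp.AtEndOfStream
    ' Always read the line, otherwise the first 5000 lines are never skipped
    Line = Inp.ReadLine
    Count = Count + 1
    If Count > 5000 Then
        OutP.WriteLine Line
    End If
Loop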
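For comparison, the same streaming idea can be sketched in Python. This is not a translation of the VBScript filter above, just an illustration of skipping the first N lines while holding only one line in memory; the function name skip_lines is made up for this example.

```python
from itertools import islice


def skip_lines(infile, outfile, skip=5000):
    """Copy every line after the first `skip` lines, one line at a time."""
    copied = 0
    # islice consumes and discards the first `skip` lines lazily.
    for line in islice(infile, skip, None):
        outfile.write(line)
        copied += 1
    return copied
```

To use it as a filter like the cscript example, you could pass sys.stdin.buffer and sys.stdout.buffer as the two streams.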
answered Aug 3, 2015 at 19:53
bill
You must have Git Bash installed, and work inside that terminal/shell.
You can use the command split for this task.
For example, this command entered into the command prompt
split YourLogFile.txt -b 500m
creates several files with a size of 500 MByte each. This will take several minutes for a file of your size. You can rename the output files (by default called "xaa", "xab", and so on) to *.txt to open them in the editor of your choice.
Make sure to check the help file for the command. You can also split the log file by the number of lines or change the name of your output files.
tested on
- Windows 7 64 bit
- Windows 10 64 bit
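If Git Bash isn't an option, a size-based split can also be sketched in a few lines of Python. This is an illustration under assumptions, not what split does internally: the function name split_by_bytes, the numeric part names (x000, x001, ...) and the buffer size are all made up here.

```python
def split_by_bytes(src, part_size, prefix="x", buffer_size=1024 * 1024):
    """Split `src` into files of at most `part_size` bytes; returns the part names."""
    names = []
    with open(src, "rb") as infile:
        index = 0
        while True:
            remaining = part_size
            # Read in small buffers so a 500 MB part is never held in memory at once.
            data = infile.read(min(buffer_size, remaining))
            if not data:
                break  # end of input: do not create an empty part
            name = f"{prefix}{index:03d}"
            with open(name, "wb") as outfile:
                while data:
                    outfile.write(data)
                    remaining -= len(data)
                    if remaining == 0:
                        break
                    data = infile.read(min(buffer_size, remaining))
            names.append(name)
            index += 1
    return names
```

For 500 MB parts you would call split_by_bytes("YourLogFile.txt", 500 * 1024 * 1024). Like split -b, this cuts at byte positions, so a line can end up divided between two parts.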
answered Jul 27, 2016 at 8:37
Wintermute
Of course there is! Win CMD can do a lot more than just split text files
Split a text file into separate files of 'max' lines each:
REM The !var! syntax needs delayed expansion, e.g. start the prompt with: cmd /v:on
REM %i is the interactive form; inside a batch file use %%i instead.
: Initialize
set input=file.txt
set max=10000
set /a line=1 >nul
set /a file=1 >nul
set out=!file!_%input%
set /a max+=1 >nul
echo Number of lines in %input%:
find /c /v "" < %input%
: Split file
for /f "tokens=* delims=[" %i in ('type "%input%" ^| find /v /n ""') do (
    if !line!==%max% (
        set /a line=1 >nul
        set /a file+=1 >nul
        set out=!file!_%input%
        echo Writing file: !out!
    )
    REM Write next file
    set a=%i
    set a=!a:*]=]!
    echo:!a:~1!>>!out!
    set /a line+=1 >nul
)
If the above code hangs or crashes, this example splits files faster (by writing data to intermediate files instead of keeping everything in memory):
e.g. to split a file with 7,600 lines into smaller files of at most 3000 lines each.
- Generate regexp string/pattern files with the set command, to be fed to the /g flag of findstr:
list1.txt
\[[0-9]\]
\[[0-9][0-9]\]
\[[0-9][0-9][0-9]\]
\[[0-2][0-9][0-9][0-9]\]
list2.txt
\[[3-5][0-9][0-9][0-9]\]
list3.txt
\[[6-9][0-9][0-9][0-9]\]
- Split the file into smaller files:
type "%input%" | find /v /n "" | findstr /b /r /g:list1.txt > file1.txt
type "%input%" | find /v /n "" | findstr /b /r /g:list2.txt > file2.txt
type "%input%" | find /v /n "" | findstr /b /r /g:list3.txt > file3.txt
- Remove the prefixed line numbers from each split file:
e.g. for the 1st file:
for /f "tokens=* delims=[" %i in ('type "%cd%\file1.txt"') do (
    set a=%i
    set a=!a:*]=]!
    echo:!a:~1!>>file_1.txt
)
Notes:
Works with leading whitespace, blank lines & whitespace lines.
Tested on Win 10 x64 CMD, on 4.4GB text file, 5651982 lines.
answered Apr 20, 2020 at 10:46
Zimba
Splitting large text files can be a challenging task when working on a Windows operating system. It is often necessary to divide a large text file into smaller parts to handle or process it in a more manageable way. There are a number of ways to split a large text file on a Windows machine, and this article outlines some of the most common and effective methods to do so.
Method 1: Using Command Prompt
Windows Command Prompt has no built-in split command. The split command used here is the GNU Core Utils tool, which is installed with packages such as Git for Windows (Git Bash); it runs from Command Prompt only if split.exe is on your PATH. It divides a file into smaller parts, based on the number of lines or the size of each part. Here are the steps:
- Open Command Prompt by pressing Win+R and typing cmd.
- Navigate to the folder where the file is located using the cd command. For example, cd C:\Users\username\Documents.
- Type the following command to split the file by the number of lines:
split -l 1000 filename.txt part-
This command splits filename.txt into parts of 1000 lines each, named part-aa, part-ab, part-ac, etc.
- Alternatively, you can split the file by the size of each part using the -b option. For example, to split the file into parts of 1 MB each:
split -b 1m filename.txt part-
This command splits filename.txt into parts of 1 MB each, named part-aa, part-ab, part-ac, etc.
- To get numeric suffixes instead of alphabetic ones, add the -d option. The prefix is always the last argument (output- here):
split -b 1m -d filename.txt output-
This command splits filename.txt into parts of 1 MB each, named output-00, output-01, output-02, etc.
That's it! You now know how to split a large text file from Command Prompt with split.
Method 2: Using PowerShell
Here's how to split a large text file into smaller files using PowerShell:
- Open PowerShell, for example by pressing the Windows key + X and selecting "Windows PowerShell". Administrator rights are not needed.
- Navigate to the directory where the large text file is located.
- Run the following command to split the file into chunks of 1000 lines:
Get-Content largefile.txt -ReadCount 1000 | %{$i=1} {$_ | Out-File -FilePath $("out{0}.txt" -f $i++) -Encoding ASCII}
This command splits largefile.txt into smaller files with 1000 lines each, named out1.txt, out2.txt, out3.txt, and so on. Note that -Encoding ASCII replaces non-ASCII characters with "?"; pick another encoding such as UTF8 if the file contains them.
- If you want to split the file into a specific number of smaller files, count the lines first and derive the chunk size from the total:
$parts = 4
$total = 0
Get-Content largefile.txt -ReadCount 1000 | %{ $total += $_.Count }
$perFile = [int][math]::Ceiling($total / $parts)
Get-Content largefile.txt -ReadCount $perFile | %{$i=1} {$_ | Out-File -FilePath $("out{0}.txt" -f $i++) -Encoding ASCII}
These commands split largefile.txt into at most 4 files with the same number of lines each (the last one may be shorter). The smaller files will be named out1.txt, out2.txt, out3.txt, and so on.
- If you want to split the file based on a specific delimiter, use the following commands:
$delimiter = "###"
$split = (Get-Content largefile.txt -Raw) -split [regex]::Escape($delimiter)
$i = 1
foreach($s in $split){
    $s | Out-File -FilePath $("out{0}.txt" -f $i++) -Encoding ASCII
}
These commands split largefile.txt into smaller files wherever the delimiter "###" occurs. The smaller files will be named out1.txt, out2.txt, out3.txt, and so on. This variant reads the whole file into memory, so it is not suitable for files of several gigabytes.
That's it! You now know how to split a large text file into smaller files using PowerShell.
Method 3: Using Notepad++
Notepad++ has no built-in command that splits a file, and it may fail to open files of several gigabytes, so this method only suits moderately sized files. The steps below move a fixed number of lines at a time into new files:
- Open the text file in Notepad++ and place the cursor at the start of the document (Ctrl+Home).
- Go to the "Search" menu and select "Find".
- In the "Find" tab, enter a regular expression that matches the number of lines you want in each file. For 500-line chunks, enter "^(.*\r?\n){500}".
- Select the "Regular expression" search mode and leave ". matches newline" unchecked.
- Click "Find Next". This should select the first 500 lines.
- Cut the selection (Ctrl+X), create a new document (Ctrl+N), paste (Ctrl+V), and save the new file with a new name.
- Return to the original document and repeat until fewer than 500 lines remain, then save what is left as the last file.
This splits the text file into multiple 500-line files.
There are also online tools for splitting text files. For a multi-gigabyte log they require uploading the whole file, so a local method is usually more practical. The Python snippets below do not call these sites; the first two perform the equivalent split locally on a file downloaded from a placeholder URL (https://example.com/large_file.txt).
1. SplitFile.net
SplitFile.net is described as a free online tool that splits large text files into smaller ones. The same kind of size-based split, done locally in Python:
import requests
url = "https://example.com/large_file.txt"
split_size = 1000000
# stream=True avoids loading the whole download into memory
response = requests.get(url, stream=True)
response.raise_for_status()
# Each chunk of up to split_size bytes becomes one output file
for i, chunk in enumerate(response.iter_content(chunk_size=split_size)):
    with open(f"split_file_{i}.txt", "wb") as f:
        f.write(chunk)
2. TextMechanic.com
TextMechanic.com is described as another free online tool for splitting text. The same kind of line-based split, done locally in Python (this version holds the whole file in memory):
import requests
url = "https://example.com/large_file.txt"
lines_per_file = 10000
response = requests.get(url)
response.raise_for_status()
lines = response.content.decode().split("\n")
# Step through the list lines_per_file entries at a time
for i, chunk in enumerate(range(0, len(lines), lines_per_file)):
    with open(f"split_file_{i}.txt", "w") as f:
        f.write("\n".join(lines[chunk:chunk+lines_per_file]))
3. Online-Convert.com
Online-Convert.com is a free online file-conversion service. The endpoint paths in the snippet below are unverified placeholders rather than a documented API, so treat it as an outline only:
import requests
url = "https://example.com/large_file.txt"
split_size = 1000000
# Placeholder endpoints (unverified): submit the job, then fetch the result by its ID
job = requests.post("https://www.online-convert.com/form-split-text", data={
    "size": split_size,
    "url": url,
})
job_id = job.url.split("/")[-1]
response = requests.post("https://www.online-convert.com/result/{}".format(job_id), stream=True)
for i, chunk in enumerate(response.iter_content(chunk_size=split_size)):
    with open(f"split_file_{i}.txt", "wb") as f:
        f.write(chunk)
These are just a few examples of how to split large text files using online tools. There are many other tools available, each with their own unique features and capabilities.
The GNU Core Utils package (available here for Windows) includes the Split utility.
The --help documentation is as follows:
Usage: split [OPTION] [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is `x'. With no INPUT, or when INPUT
is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   use suffixes of length N (default 2)
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of lines per output file
  -d, --numeric-suffixes  use numeric suffixes instead of alphabetic
  -l, --lines=NUMBER      put NUMBER lines per output file
      --verbose           print a diagnostic to standard error just
                          before each output file is opened
      --help              display this help and exit
      --version           output version information and exit
SIZE may have a multiplier suffix: b for 512, k for 1K, m for 1 Meg.
For example, to split input.txt into 100 MB chunks, only splitting at the ends of lines,
split input.txt -C 100m
will give you output files named xaa, xab, xac, etc.
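To make the "only splitting at the ends of lines" behaviour concrete, here is a rough Python sketch of the idea behind -C. It is an approximation, not GNU split's implementation: the function name split_line_bytes and the numeric part names are made up, and a single line longer than the limit is simply written to a part of its own instead of being broken up.

```python
def split_line_bytes(src, max_bytes, prefix="x"):
    """Pack whole lines into parts of at most `max_bytes` bytes each."""
    names = []
    out = None
    used = 0  # bytes written to the current part so far
    with open(src, "rb") as infile:
        for line in infile:
            # Start a new part when this line would push the current one over the limit.
            if out is None or (used and used + len(line) > max_bytes):
                if out:
                    out.close()
                name = f"{prefix}{len(names):03d}"
                out = open(name, "wb")
                names.append(name)
                used = 0
            out.write(line)
            used += len(line)
    if out:
        out.close()
    return names
```

The equivalent of the command above would be split_line_bytes("input.txt", 100 * 1024 * 1024).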
If you need to split a large file into several parts for transfer, storage, or other purposes, there are several ways to do it in Windows 11, 10, and earlier versions: with PowerShell and Command Prompt commands, with third-party programs, or with online services (in the last case the OS does not matter).
This guide covers several ways to split a large file into parts in detail, and briefly covers splitting specific file types: pdf and txt, video, and others.
Ways to split a large file into parts
This part of the guide is not tied to any particular file type: any file can be split regardless of type and content, whether it is text, binary, media, or something else. Most such files cannot be read until the parts are joined back together.
Splitting a file into parts in PowerShell
The first option is to use PowerShell commands and scripts.
FileSplitter
If you'd rather not write such scripts yourself, I recommend using a ready-made module for splitting files:
- Run PowerShell as Administrator. In Windows 11 and Windows 10 you can do this by right-clicking the Start button and choosing "Windows PowerShell (Admin)" or "Windows Terminal (Admin)".
- Install the FileSplitter module with the command
Install-Module -Name FileSplitter
During installation you will need to confirm by typing Y and pressing Enter.
- Once the module is installed, PowerShell supports two new commands: Split-File for splitting a file and Join-File for joining it back.
Usage examples:
Split-File -Path "C:\test.zip" -PartSizeBytes 2.5MB
Splits the file C:\test.zip into 2.5 MB parts named testzip.00.part, testzip.01.part, and so on, in the same location as the original file.
Join-File -Path "C:\test.zip"
Looks for the file c:\testzip.00.part and the remaining parts, and joins them into C:\test.zip.
Now a few PowerShell script examples that may be useful if you want to work out your own implementation.
Splitting a text file in PowerShell
A script for splitting a file with text content (txt, log, and others) into parts of the size set in the first line; the split happens on line boundaries, so no line is cut in the middle:
$upperBound = 1MB
$ext = "txt"
$rootName = "txt_"
$reader = new-object System.IO.StreamReader("C:\text.txt")
$count = 1
$fileName = "{0}{1}.{2}" -f ($rootName, $count, $ext)
while(($line = $reader.ReadLine()) -ne $null)
{
    Add-Content -path $fileName -value $line
    if((Get-ChildItem -path $fileName).Length -ge $upperBound)
    {
        ++$count
        $fileName = "{0}{1}.{2}" -f ($rootName, $count, $ext)
    }
}
$reader.Close()
Splitting and joining an arbitrary binary file
A script for splitting an arbitrary file into parts and joining them back:
function Split-Files
{
    [CmdletBinding()]
    Param (
        [Parameter(Mandatory = $true, ValueFromPipeLine = $true, ValueFromPipelineByPropertyName = $true)]
        [String] $InputFile,
        [Parameter(Mandatory = $true)]
        [String] $OutputDirectory,
        [Parameter(Mandatory = $false)]
        [String] $OutputFilePrefix = "chunk",
        [Parameter(Mandatory = $false)]
        [Int32] $ChunkSize = 4096
    )
    Begin
    {
        Write-Output "Beginning file split..."
    }
    Process
    {
        if (-not (Test-Path -Path $OutputDirectory))
        {
            New-Item -ItemType Directory $OutputDirectory | Out-Null
            Write-Verbose "Created OutputDirectory: $OutputDirectory"
        }
        $FileStream = [System.IO.File]::OpenRead($InputFile)
        $ByteChunks = New-Object byte[] $ChunkSize
        $ChunkNumber = 1
        While ($BytesRead = $FileStream.Read($ByteChunks, 0, $ChunkSize))
        {
            $OutputFile = Join-Path -Path $OutputDirectory -ChildPath "$OutputFilePrefix$ChunkNumber"
            $OutputStream = [System.IO.File]::OpenWrite($OutputFile)
            $OutputStream.Write($ByteChunks, 0, $BytesRead)
            $OutputStream.Close()
            Write-Verbose "Wrote File: $OutputFile"
            $ChunkNumber += 1
        }
    }
    End
    {
        Write-Output "Finished splitting file."
    }
}
function Unsplit-Files
{
    [CmdletBinding()]
    Param (
        [Parameter(Mandatory = $true)]
        [String] $InputDirectory,
        [Parameter(Mandatory = $false)]
        [String] $InputFilePrefix = "chunk",
        [Parameter(Mandatory = $true)]
        [String] $OutputDirectory,
        [Parameter(Mandatory = $true)]
        [String] $OutputFile
    )
    Begin
    {
        Write-Output "Beginning file unsplit..."
    }
    Process
    {
        if (-not (Test-Path -Path $OutputDirectory))
        {
            New-Item -ItemType Directory $OutputDirectory | Out-Null
            Write-Verbose "Created OutputDirectory: $OutputDirectory"
        }
        $OutputPath = Join-Path -Path $OutputDirectory -ChildPath $OutputFile
        $OutputStream = [System.Io.File]::OpenWrite($OutputPath)
        $ChunkNumber = 1
        $InputFilename = Join-Path -Path $InputDirectory -ChildPath "$InputFilePrefix$ChunkNumber"
        while (Test-Path $InputFilename)
        {
            $FileBytes = [System.IO.File]::ReadAllBytes($InputFilename)
            $OutputStream.Write($FileBytes, 0, $FileBytes.Count)
            Write-Verbose "Unsplit File: $InputFilename"
            $ChunkNumber += 1
            $InputFilename = Join-Path -Path $InputDirectory -ChildPath "$InputFilePrefix$ChunkNumber"
        }
        $OutputStream.close()
    }
    End
    {
        Write-Output "Finished unsplitting file."
    }
}
Usage example (importing the module, splitting a file, and joining it):
Import-Module C:\Split-Files.ps1
Split-Files -InputFile "path_to_large_file.zip" -OutputDirectory "path_to_save_location" -ChunkSize PART_SIZE_IN_BYTES -Verbose
Unsplit-Files -InputDirectory "path_to_folder_with_parts" -OutputDirectory "path_for_joined_file" -OutputFile joined_file_name.zip
MakeCab
Windows includes a built-in utility for creating .cab files, which can be used to split a file into parts. The steps:
- Create a text file ddf.txt with the following content:
.Set CabinetNameTemplate=test_*.cab; <-- Enter chunk name format
.Set MaxDiskSize=900000; <-- Enter the part size here
.Set ClusterSize=1000
.Set Cabinet=on;
.Set Compress=off;
.set CompressionType=LZX;
.set CompressionMemory=21
.Set DiskDirectoryTemplate=;
path_to_source_file
- Run the command
makecab /f path_to_ddf.txt
in Command Prompt.
- As a result, .cab files of the specified size are created in the Command Prompt's current working folder.
- To join the cab files back into the original file, use the command
extrac32 filecab path_to_joined_file
giving the path to the first file in the sequence as the first parameter.
Third-party programs with file-splitting features
There are third-party applications designed specifically for splitting files into parts, as well as tools that offer it as one of their features. Let's start with the most common option: archivers.
Archivers
Most archivers can split the archive they create into several volumes of a given size. If compression itself is not needed, it is enough to create an archive without compression.
For example, in the free archiver 7-Zip, select the file or files, click "Add", and then configure the archive and the size of the volumes it will be split into.
In WinRAR the steps are the same.
In both cases the volume size can be picked from a list or typed in manually to suit your needs. With this method I recommend the ZIP format, as the most widely supported.
To get the original file back, put all the archive files in one location and extract the archive with any archiver.
Total Commander
The well-known file manager Total Commander has options for splitting and combining files in its "File" menu.
Select a file on the computer and use that menu item to split it into parts of a given size.
The file can later be reassembled with Total Commander as well.
Dedicated file-splitting utilities
You can also find many utilities online that are designed specifically for splitting files and joining them back.
KFK File Splitter
The free KFK utility has a Russian-language interface that is clear enough to need no explanation; all the steps for splitting and rejoining are self-evident.
Official download site for KFK File Splitter: https://www.kcsoftwares.com/?kfk
FFSJ (File Splitter & Joiner)
FFSJ is very similar to the first program in the list, but without a Russian interface.
The program has two main tabs, for splitting and for joining files, plus a third for viewing file checksums.
GSplit
GSplit is one of the most popular file-splitting programs, with a good set of extra features but, unfortunately, no Russian interface.
Basic use of GSplit:
- Choose the source file or files under "Original File".
- Set where the split file will be saved under "Destination Folder".
- Set the size and type of the parts under Pieces > Type and Size.
- Start the split with the Split button.
Later, when needed, you can use the Unite button to join the parts.
If you want a simpler approach with fewer settings, use the "Express" button in the program's menu. The tool can also create "self-uniting" parts: see the Self-Uniting section.
GSplit can be downloaded from the official site https://www.gdgsoft.com/gsplit
File Splitter (command-line utility)
If you need to split files from the command line, you can use the console utility File Splitter, available for free on the developer's GitHub: https://github.com/dubasdey/File-Splitter
Example of using the utility:
fsplit -split 1024 kb c:\file.txt
To join the parts, use the copy command with the /a parameter for text files or /b for arbitrary binary files, for example:
copy /A test1.txt+test2.txt file.txt
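Joining is just byte-for-byte concatenation in the right order, which is what copy /b does. A minimal Python sketch of the same step is shown below; the function name join_parts is made up for this example, and the caller is assumed to pass the part names already in the correct order.

```python
import shutil


def join_parts(part_names, dest):
    """Concatenate the given parts, in the order listed, into `dest`."""
    with open(dest, "wb") as outfile:
        for name in part_names:
            with open(name, "rb") as part:
                # copyfileobj streams in blocks, so large parts are not loaded whole.
                shutil.copyfileobj(part, outfile)
    return dest
```

For the example above this would be join_parts(["test1.txt", "test2.txt"], "file.txt").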
Users often need to split not an arbitrary file but a specific kind of file, and not into parts that must be joined later but into fragments that can each be viewed on their own. For this you can use suitable PDF editors, audio editors, and video editors. Online services may also make sense:
For splitting PDF files into parts or pages:
- Adobe's official PDF splitting tool: https://www.adobe.com/acrobat/online/split-pdf.html
- An unofficial online service, available in Russian: https://pdf.io/ru/split/
- And many others.
For splitting MP3:
- The audio splitting service from veed.io
- Aspose Audio Splitter https://products.aspose.app/audio/ru/splitter/mp3
- And others: there are plenty of such online services on the Internet.
I hope some readers find this information useful. If you have questions or additions to the article, I look forward to your comments.
You can also split the file using third-party software such as HJSplit (http://www.hjsplit.org/). For example, give it your input file, which can be up to 9 GB, and then split it; in my case I split into parts of 10 MB each.
• How to display text in pygame?
• How can I use a batch file to write to a text file?
• Basic text editor in command prompt?
• How to count the number of words in a sentence, ignoring numbers, punctuation and whitespace?
• How to remove text before | character in notepad++
• how to customise input field width in bootstrap 3
• How to set text size in a button in html
• How do I append text to a file?
• Writing new lines to a text file in PowerShell
• How to read existing text files without defining path
• How to place Text and an Image next to each other in HTML?
• Changing background color of text box input not working when empty
• Indent starting from the second line of a paragraph with CSS
• Align text to the bottom of a div
• How do I find all files containing specific text on Linux?
• Find specific string in a text file with VBS script
• How to convert text column to datetime in SQL
• Output grep results to text file, need cleaner output
• Javascript change color of text and background to input value
• Using BufferedReader to read Text File
• Saving a text file on server using JavaScript
• Java: print contents of text file to screen
• How to add text to an existing div with jquery
• Making text background transparent but not text itself
• How to center a <p> element inside a <div> container?
• How to read a text file into a list or an array with Python
• Matplotlib scatter plot with different text at each data point
• Text on image mouseover?
• how to align text vertically center in android
• How to read text file in JavaScript
• Text border using css (border around text)
• How to remove non UTF-8 characters from text file
• How to print Two-Dimensional Array like table
• Read a text file in R line by line
Questions with cmd tag:
• ‘ls’ is not recognized as an internal or external command, operable program or batch file
• » is not recognized as an internal or external command, operable program or batch file
• XCOPY: Overwrite all without prompt in BATCH
• VSCode Change Default Terminal
• How to install pandas from pip on windows cmd?
• ‘ls’ in CMD on Windows is not recognized
• Command to run a .bat file
• VMware Workstation and Device/Credential Guard are not compatible
• How do I kill the process currently using a port on localhost in Windows?
• how to run python files in windows command prompt?
• CMD (command prompt) can’t go to the desktop
• NPM stuck giving the same error EISDIR: Illegal operation on a directory, read at error (native)
• What is the reason for the error message «System cannot find the path specified»?
• Execute a batch file on a remote PC using a batch file on local PC
• How to split large text file in windows?
• PHP is not recognized as an internal or external command in command prompt
• Change directory in Node.js command prompt
• How to disable Hyper-V in command line?
• How do I create a shortcut via command-line in Windows?
• Windows equivalent of ‘touch’ (i.e. the node.js way to create an index.html)
• How to run Pip commands from CMD
• Curl not recognized as an internal or external command, operable program or batch file
• Windows CMD command for accessing usb?
• Create a batch file to run an .exe with an additional parameter
• Find Process Name by its Process ID
• Set proxy through windows command line including login parameters
• Open a Web Page in a Windows Batch FIle
• find path of current folder — cmd
• Can’t check signature: public key not found
• Running CMD command in PowerShell
• Ping with timestamp on Windows CLI
• The filename, directory name, or volume label syntax is incorrect inside batch
• PowerShell The term is not recognized as cmdlet function script file or operable program
• «if not exist» command in batch file
• Windows 7 — ‘make’ is not recognized as an internal or external command, operable program or batch file
• Using a batch to copy from network drive to C: or D: drive
• Split string with string as delimiter
• «Could not find or load main class» Error while running java program using cmd prompt
• Find Number of CPUs and Cores per CPU using Command Prompt
• Set windows environment variables with a batch file
• Executing set of SQL queries using batch file?
• CMD what does /im (taskkill)?
• How to run a command in the background on Windows?
• How to run different python versions in cmd
• Open a URL without using a browser from a batch file
• How to fix ‘.’ is not an internal or external command error
• Redirecting Output from within Batch file
• UTF-8 in Windows 7 CMD
• How to copy a folder via cmd?
• Basic text editor in command prompt?
Questions with split tag:
• Parameter «stratify» from method «train_test_split» (scikit Learn)
• Pandas split DataFrame by column value
• How to split large text file in windows?
• Attribute Error: ‘list’ object has no attribute ‘split’
• Split function in oracle to comma separated values with automatic sequence
• How would I get everything before a : in a string Python
• Split String by delimiter position using oracle SQL
• JavaScript split String with white space
• Split a String into an array in Swift?
• Split pandas dataframe in two if it has more than 10 rows
• Split text file into smaller multiple text file using command line
• How to split a python string on new line characters
• Uncaught TypeError: Cannot read property ‘split’ of undefined
• Split string with string as delimiter
• Split text with ‘\r\n’
• Split string in JavaScript and detect line break
• How to split CSV files as per number of rows specified?
• Splitting dataframe into multiple dataframes
• Splitting a dataframe string column into multiple different columns
• Get first word of string
• Split a large dataframe into a list of data frames based on common value in column
• split string only on first instance — java
• Regex to split a CSV
• Python 2: AttributeError: ‘list’ object has no attribute ‘strip’
• splitting a string based on tab in the file
• Excel CSV. file with more than 1,048,576 rows of data
• Reading a text file and splitting it into single words in python
• Split a string into array in Perl
• Java equivalent to Explode and Implode(PHP)
• How to split a string at the first `/` (slash) and surround part of it in a `<span>`?
• How to split a string and assign it to variables
• Splitting String with delimiter
• how to get the last part of a string before a certain character?
• Split string into tokens and save them in an array
• Splitting on last delimiter in Python string?
• How to split the name string in mysql?
• Java string split with «.» (dot)
• Java String split removed empty values
• Java replace all square brackets in a string
• Java split string to array
• Parse (split) a string in C++ using string delimiter (standard C++)
• How to extract a string between two delimiters
• Split string based on regex
• String split on new line, tab and some number of spaces
• Split a List into smaller lists of N size
• T-SQL split string
• How to split a comma-separated string?
• How to split a string into an array in Bash?
• What is causing the error `string.split is not a function`?
• Python read in string from file and split it into values
Questions with size tag:
• PySpark 2.0 The size or shape of a DataFrame
• How to set label size in Bootstrap
• How to create a DataFrame of random integers with Pandas?
• How to split large text file in windows?
• How can I get the size of an std::vector as an int?
• How to load specific image from assets with Swift
• How to find integer array size in java
• Fit website background image to screen size
• How to set text size in a button in html
• How to change font size in html?
• How to reduce a huge excel file
• PHPExcel auto size column width
• Array[n] vs Array[10] — Initializing array with variable vs real number
• How to Get True Size of MySQL Database?
• Google Chrome default opening position and size
• Image size (Python, OpenCV)
• creating array without declaring the size — java
• Resizing a button
• size of NumPy array
• how to find 2d array size in c++
• Find files with size in Unix
• How to know the size of the string in bytes?
• How to change legend size with matplotlib.pyplot
• android:layout_height 50% of the screen size
• Getting file size in Python?
• Customize list item bullets using CSS
• Android screen size HDPI, LDPI, MDPI
• Is it really impossible to make a div fit its size to its content?
• change array size
• Set element width or height in Standards Mode
• How can I initialize an array without knowing it size?
• Change the size of a JTextField inside a JBorderLayout
• How much data can a List can hold at the maximum?
• Using jQuery To Get Size of Viewport
• How can I fix the form size in a C# Windows Forms application and not to let user change its size?
• Array Size (Length) in C#
• How to change the size of the font of a JLabel to take the maximum size
• How can I set size of a button?
• How to get the size of a range in Excel
• How many bytes in a JavaScript string?
• How to get the size of the current screen in WPF?
• How do you stretch an image to fill a <div> while keeping the image’s aspect-ratio?
• How many characters in varchar(max)
• Determine the size of an InputStream
• Java JTable setting Column Width
• std::string length() and size() member functions
• Difference between long and int data types
• increase the java heap size permanently?
• How Big can a Python List Get?
• Automatically size JPanel inside JFrame