How to: Split a File Into Many Files by Using Groups (LINQ)
This example shows one way to merge the contents of two files and then create a set of new files that organize the data in a new way.
To create the data files
Copy these names into a text file that is named names1.txt and save it in your solution folder:
Bankov, Peter
Holm, Michael
Garcia, Hugo
Potra, Cristina
Noriega, Fabricio
Aw, Kam Foo
Beebe, Ann
Toyoshima, Tim
Guy, Wey Yuan
Garcia, Debra
Copy these names into a text file that is named names2.txt and save it in your solution folder. Note that the two files have some names in common.
Liu, Jinghao
Bankov, Peter
Holm, Michael
Garcia, Hugo
Beebe, Ann
Gilchrist, Beth
Myrcha, Jacek
Giakoumakis, Leo
McLin, Nkenge
El Yassir, Mehdi
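Because the two lists overlap, the merge step in the example relies on Union to drop the repeated names. The following is a minimal sketch of that behavior on in-memory arrays rather than the files. The class name UnionSketch and the helper Merge are hypothetical and are not part of the sample.

```csharp
using System;
using System.Linq;

static class UnionSketch
{
    // Hypothetical helper: merges two name lists and drops exact duplicates.
    // Union compares strings with the default equality comparer, so
    // "Bankov, Peter" and "bankov, peter" would count as different names.
    public static string[] Merge(string[] first, string[] second)
    {
        // Results keep the order of first appearance: all distinct names
        // from the first list, then the names that occur only in the second.
        return first.Union(second).ToArray();
    }
}
```

For example, merging { "Bankov, Peter", "Holm, Michael" } with { "Liu, Jinghao", "Bankov, Peter" } yields three names in the order Bankov, Holm, Liu.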
Example
Class SplitWithGroups
    Shared Sub Main()
        Dim fileA As String() = System.IO.File.ReadAllLines("../../../names1.txt")
        Dim fileB As String() = System.IO.File.ReadAllLines("../../../names2.txt")

        ' Concatenate and remove duplicate names based on
        ' default string comparer
        Dim mergeQuery As IEnumerable(Of String) = fileA.Union(fileB)

        ' Group the names by the first letter in the last name
        Dim groupQuery = From name In mergeQuery
                         Let n = name.Split(New Char() {","c})
                         Order By n(0)
                         Group By groupKey = n(0)(0)
                         Into groupName = Group

        ' Create a new file for each group that was created.
        ' Note that nested For Each loops are required to access
        ' individual items within each group.
        For Each gGroup In groupQuery
            Dim fileName As String = "../../../testFile_" & gGroup.groupKey & ".txt"
            Console.WriteLine(gGroup.groupKey)
            ' The Using block closes the file even if a write fails.
            Using sw As New System.IO.StreamWriter(fileName)
                For Each item In gGroup.groupName
                    Console.WriteLine(" " & item.name)
                    sw.WriteLine(item.name)
                Next
            End Using
        Next

        ' Keep console window open in debug mode.
        Console.WriteLine("Files have been written. Press any key to exit.")
        Console.ReadKey()
    End Sub
End Class
' Console Output:
' A
'  Aw, Kam Foo
' B
'  Bankov, Peter
'  Beebe, Ann
' E
'  El Yassir, Mehdi
' G
'  Garcia, Hugo
'  Garcia, Debra
'  Giakoumakis, Leo
'  Gilchrist, Beth
'  Guy, Wey Yuan
' H
'  Holm, Michael
' L
'  Liu, Jinghao
' M
'  McLin, Nkenge
'  Myrcha, Jacek
' N
'  Noriega, Fabricio
' P
'  Potra, Cristina
' T
'  Toyoshima, Tim
class SplitWithGroups
{
    static void Main()
    {
        string[] fileA = System.IO.File.ReadAllLines(@"../../../names1.txt");
        string[] fileB = System.IO.File.ReadAllLines(@"../../../names2.txt");

        // Concatenate and remove duplicate names based on
        // default string comparer
        var mergeQuery = fileA.Union(fileB);

        // Group the names by the first letter in the last name.
        var groupQuery = from name in mergeQuery
                         let n = name.Split(',')
                         group name by n[0][0] into g
                         orderby g.Key
                         select g;

        // Create a new file for each group that was created.
        // Note that nested foreach loops are required to access
        // individual items within each group.
        foreach (var g in groupQuery)
        {
            // Create the new file name.
            string fileName = @"../../../testFile_" + g.Key + ".txt";

            // Output to display.
            Console.WriteLine(g.Key);

            // Write file.
            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(fileName))
            {
                foreach (var item in g)
                {
                    sw.WriteLine(item);
                    // Output to console for example purposes.
                    Console.WriteLine(" {0}", item);
                }
            }
        }

        // Keep console window open in debug mode.
        Console.WriteLine("Files have been written. Press any key to exit");
        Console.ReadKey();
    }
}
/* Output:
A
 Aw, Kam Foo
B
 Bankov, Peter
 Beebe, Ann
E
 El Yassir, Mehdi
G
 Garcia, Hugo
 Guy, Wey Yuan
 Garcia, Debra
 Gilchrist, Beth
 Giakoumakis, Leo
H
 Holm, Michael
L
 Liu, Jinghao
M
 Myrcha, Jacek
 McLin, Nkenge
N
 Noriega, Fabricio
P
 Potra, Cristina
T
 Toyoshima, Tim
*/
The program writes a separate file for each group in the same folder as the data files.
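The grouping and file-writing steps can also be written with LINQ method syntax. The following is a rough sketch, not part of the sample: the class GroupSketch, the helpers GroupByInitial and WriteGroups, and the outputFolder parameter are hypothetical names. It assumes that every non-blank line has the form "Last, First" and that the output folder already exists.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class GroupSketch
{
    // Hypothetical helper: groups names by the first character of the
    // last name and returns the groups ordered by that character.
    public static List<IGrouping<char, string>> GroupByInitial(IEnumerable<string> names)
    {
        return names
            .Where(name => name.Trim().Length > 0)   // skip blank lines
            .GroupBy(name => name.Split(',')[0][0])  // key: first letter of the last name
            .OrderBy(g => g.Key)
            .ToList();
    }

    // Hypothetical helper: writes one testFile_<letter>.txt per group
    // into outputFolder, which is assumed to exist.
    public static void WriteGroups(IEnumerable<string> names, string outputFolder)
    {
        foreach (IGrouping<char, string> g in GroupByInitial(names))
        {
            string fileName = Path.Combine(outputFolder, "testFile_" + g.Key + ".txt");
            // ToArray keeps the call compatible with .NET Framework 3.5.
            File.WriteAllLines(fileName, g.ToArray());
        }
    }
}
```

Within each group the names keep the order in which they appeared in the input, as in the C# example. Path.Combine builds the file path from a folder and a file name, so the folder does not have to be hard-coded as a relative string.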
Compiling the Code
Create a Visual Studio project that targets the .NET Framework version 3.5. By default, the project has a reference to System.Core.dll and a using directive (C#) or Imports statement (Visual Basic) for the System.Linq namespace. In C# projects, add a using directive for the System.IO namespace.
Copy this code into your project.
Press F5 to compile and run the program.
Press any key to exit the console window.