Can someone provide some light on how to do this? I can do this for regular text or byte array, but not sure how to approach for a pdf. do i stuff the pdf into a byte array first?
Use File.ReadAllBytes
to load the PDF file, and then encode the byte array as normal using Convert.ToBase64String(bytes)
.
There is a way that you can do this in chunks so that you don't have to burn a ton of memory all at once.
.Net includes an encoder that can do the chunking, but it's in kind of a weird place. They put it in the System.Security.Cryptography namespace.
I have tested the example code below, and I get identical output using either my method or Andrew's method above.
Here's how it works: You fire up a class called a CryptoStream. This is kind of an adapter that plugs into another stream. You plug a class called CryptoTransform into the CryptoStream (which in turn is attached to your file/memory/network stream) and it performs data transformations on the data while it's being read from or written to the stream.
Normally, the transformation is encryption/decryption, but .net includes ToBase64 and FromBase64 transformations as well, so we won't be encrypting, just encoding.
Here's the code. I included a (maybe poorly named) implementation of Andrew's suggestion so that you can compare the output.
class Base64Encoder
{
public void Encode(string inFileName, string outFileName)
{
System.Security.Cryptography.ICryptoTransform transform = new System.Security.Cryptography.ToBase64Transform();
using(System.IO.FileStream inFile = System.IO.File.OpenRead(inFileName),
outFile = System.IO.File.Create(outFileName))
using (System.Security.Cryptography.CryptoStream cryptStream = new System.Security.Cryptography.CryptoStream(outFile, transform, System.Security.Cryptography.CryptoStreamMode.Write))
{
// I'm going to use a 4k buffer, tune this as needed
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = inFile.Read(buffer, 0, buffer.Length)) > 0)
cryptStream.Write(buffer, 0, bytesRead);
cryptStream.FlushFinalBlock();
}
}
public void Decode(string inFileName, string outFileName)
{
System.Security.Cryptography.ICryptoTransform transform = new System.Security.Cryptography.FromBase64Transform();
using (System.IO.FileStream inFile = System.IO.File.OpenRead(inFileName),
outFile = System.IO.File.Create(outFileName))
using (System.Security.Cryptography.CryptoStream cryptStream = new System.Security.Cryptography.CryptoStream(inFile, transform, System.Security.Cryptography.CryptoStreamMode.Read))
{
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = cryptStream.Read(buffer, 0, buffer.Length)) > 0)
outFile.Write(buffer, 0, bytesRead);
outFile.Flush();
}
}
// this version of Encode pulls everything into memory at once
// you can compare the output of my Encode method above to the output of this one
// the output should be identical, but the crytostream version
// will use way less memory on a large file than this version.
public void MemoryEncode(string inFileName, string outFileName)
{
byte[] bytes = System.IO.File.ReadAllBytes(inFileName);
System.IO.File.WriteAllText(outFileName, System.Convert.ToBase64String(bytes));
}
}
I am also playing around with where I attach the CryptoStream. In the Encode method,I am attaching it to the output (writing) stream, so when I instance the CryptoStream, I use its Write() method.
When I read, I'm attaching it to the input (reading) stream, so I use the read method on the CryptoStream. It doesn't really matter which stream I attach it to. I just have to pass the appropriate Read or Write enumeration member to the CryptoStream's constructor.