File Processing

The storage of data in variables and arrays is temporary. Files are used for data persistence, the permanent retention of large amounts of data. Computers store files on secondary storage devices, such as magnetic disks, optical disks, and tapes. We consider both sequential-access files and random-access files. We compare formatted-data file processing and raw-data file processing. We examine techniques for the input of data from, and output of data to, strings rather than files.

File Storing Process

The Data Hierarchy

Files and Streams

Creating a Sequential-Access File

Reading Data from a Sequential-Access File

Updating Sequential-Access Files

Random Access File

Creating a Random-Access File

Writing Data Randomly to a Random-Access File

Reading Data Sequentially from a Random-Access File

Input/Output of Objects

The Data Hierarchy in C++

It is simple and economical to build electronic devices that can assume two stable states.

The first state represents zero (0).
The second state represents one (1).

The smallest data item that computers support is called a bit. Each bit can assume either the value 0 or the value 1. Programming with data in the low-level form of bits is cumbersome. It is preferable to program with data in forms such as decimal digits (i.e., 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9), letters (i.e., A through Z and a through z) and special symbols (i.e., $, @, %, &, *, (, ), -, +, ", :, ?, / and many others). Digits, letters, and special symbols are referred to as characters. The set of all characters used to write programs and represent data items on a particular computer is called that computer’s character set. Because computers can process only 1s and 0s, every character in a computer’s character set is represented as a pattern of 1s and 0s. Bytes are composed of eight bits.
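As a small illustration of characters being stored as bit patterns, the following sketch uses std::bitset to display the eight bits of a character. The helper names bitPattern and showBits are made up for this example, and the pattern you see for a given letter assumes an ASCII-compatible character set.

```cpp
#include <bitset>
#include <iostream>
#include <string>

// Hypothetical helper: returns the 8-bit pattern that stores one character.
std::string bitPattern(char c)
{
    // Convert through unsigned char so negative char values map to 0..255.
    return std::bitset<8>(static_cast<unsigned char>(c)).to_string();
}

// Prints each character of the text next to its bit pattern.
void showBits(const std::string &text)
{
    for (char c : text)
        std::cout << c << " = " << bitPattern(c) << '\n';
}
```

Calling showBits("C++") prints one line per character, each followed by its pattern of 1s and 0s.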

Data items processed by computers form a data hierarchy, in which data items become larger and more complex in structure as we progress from bits, to characters, to fields and to larger data structures. Just as characters are composed of bits, a field is a group of characters that conveys meaning, and a record is composed of several fields.

For example, the record for a particular employee might include the following fields:

1. Employee identification number
2. Name
3. Address
4. Hourly pay rate
5. Number of exemptions claimed
6. Year-to-date earnings
7. Amount of taxes withheld

Data hierarchy in C++

Thus, a record is a group of related fields, and a file is a group of related records. To facilitate the retrieval of specific records from the file, at least one field in each record is chosen as a record key. A record key identifies a record as belonging to a particular person or entity and distinguishes that record from all other records. In the employee record above, the employee identification number would normally serve as the record key. There are many ways of organizing records in a file. A common type of organization is called a sequential file, in which records typically are stored in order by a record-key field. A group of related files often is stored in a database. A collection of programs designed to create and manage databases is called a database management system (DBMS).
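The ideas of a record, a record key and a sequential file can be sketched in code. The struct and function names below are illustrative assumptions, not part of any library. Only a few of the employee fields are included, and the records are read from a generic std::istream so the same function works for a file or for a string stream.

```cpp
#include <istream>
#include <sstream>
#include <string>

// A record: a group of related fields (hypothetical, simplified layout).
struct EmployeeRecord
{
    int id = 0;            // record key: employee identification number
    std::string name;      // single-word name, to keep parsing simple
    double hourlyRate = 0; // hourly pay rate
};

// Scans the records in order until the key matches.
// Returns true and fills 'found' if a record with that key exists.
bool findByKey(std::istream &file, int key, EmployeeRecord &found)
{
    EmployeeRecord rec;
    while (file >> rec.id >> rec.name >> rec.hourlyRate)
    {
        if (rec.id == key)
        {
            found = rec;
            return true;
        }
    }
    return false; // reached end of file without a match
}
```

For a quick experiment, a std::istringstream holding a few lines such as "101 Jones 15.5" can stand in for the file. Note that the search has to walk through every record that precedes the one it is looking for.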

Files and Streams in C++

When a file is opened, an object is created, and a stream is associated with the object. The streams associated with these objects provide communication channels between a program and a particular file or device. For example, the cin object (the standard input stream object) enables a program to input data from the keyboard or from other devices, the cout object (the standard output stream object) enables a program to output data to the screen or other devices, and the cerr and clog objects (the standard error stream objects) enable a program to output error messages to the screen or other devices.

Files and Streams in C++

To perform file processing in C++, header files <iostream> and <fstream> must be included. Header <fstream> includes the definitions for the stream-class templates basic_ifstream (for file input), basic_ofstream (for file output) and basic_fstream (for file input and output). Each class template has a predefined template specialization that enables char I/O. In addition, the <fstream> library provides a set of typedefs that provide aliases for these template specializations. For example, the typedef ifstream represents a specialization of basic_ifstream that enables char input from a file. Similarly, typedef ofstream represents a specialization of basic_ofstream that enables char output to files.

Files are opened by creating objects of these stream template specializations. These templates “derive” from class templates basic_istream, basic_ostream and basic_iostream, respectively. Thus, all member functions, operators and manipulators that belong to these templates also can be applied to file streams.
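Because ofstream derives from ostream, a function written against ostream& works unchanged with cout, a file stream or a string stream. A minimal sketch follows; the function names writeGreeting and roundTrip are assumptions made for the example.

```cpp
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Writes one line to any output stream: cout, an ofstream, an ostringstream...
void writeGreeting(std::ostream &out, const std::string &who)
{
    out << "Hello, " << who << '\n';
}

// Writes the greeting to a file, then reads the line back.
std::string roundTrip(const std::string &fileName, const std::string &who)
{
    {
        std::ofstream file(fileName); // an ofstream is an ostream
        writeGreeting(file, who);
    } // file is closed here by the ofstream destructor

    std::ifstream in(fileName);
    std::string line;
    std::getline(in, line); // std::getline accepts any istream
    return line;
}
```

The same writeGreeting call can be pointed at std::cout to print to the screen, which is the practical benefit of the inheritance relationship.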

Creating a Sequential-Access File in C++

C++ imposes no structure on a file. Thus, a concept like that of a “record” does not exist in a C++ file. Therefore, the programmer must structure files to meet the application’s requirements.

The following program creates a sequential-access file that might be used in an accounts-receivable system to help manage the money owed by a company’s credit clients.

#include <iostream>
using std::cout;
using std::cin;
using std::ios;
using std::cerr;
using std::endl;
#include <fstream>
using std::ofstream;
#include <cstdlib>
using std::exit;

int main()
{
    // ofstream constructor opens the file (creating or truncating it)
    ofstream outClientFile("clients.dat", ios::out);

    // exit program if unable to create file
    if (!outClientFile) // overloaded ! operator
    {
        cerr << "File could not be opened" << endl;
        exit(1);
    }

    cout << "Enter the account, name, and balance" << endl
         << "Enter end-of-file to end input\n?";

    int account;
    char name[30];
    double balance;

    // read account, name and balance from cin, then place them in the file
    while (cin >> account >> name >> balance)
    {
        outClientFile << account << ' ' << name << ' ' << balance
                      << endl;
        cout << "?";
    }

    return 0; // ofstream destructor closes the file
}

OUTPUT:

Enter the account, name, and balance
Enter end-of-file to end input
?100 Jones 24.98
?200 Doe 345.67
?300 White 0.00
?400 Stone -42.16
?500 Rich 224.62
?^z

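A file-open mode can be given as the second argument when a file is opened. The file-open modes are listed below.

List of file open modes

ios::app     Append all output to the end of the file.
ios::ate     Move to the end of the file immediately after opening it.
ios::in      Open a file for input.
ios::out     Open a file for output.
ios::trunc   Discard the file’s existing contents (also the default when ios::out is used alone).
ios::binary  Open a file for binary (i.e., non-text) input or output.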
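To make the effect of the open mode concrete, this sketch writes to the same file more than once, with ios::out (which discards existing contents) or with ios::app (which appends). The helper names writeWithMode and readAll are assumptions for illustration.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Opens fileName with the given mode and writes text to it.
void writeWithMode(const std::string &fileName, const std::string &text,
                   std::ios::openmode mode)
{
    std::ofstream out(fileName, mode);
    out << text;
}

// Returns the whole contents of the file as one string.
std::string readAll(const std::string &fileName)
{
    std::ifstream in(fileName, std::ios::in);
    std::ostringstream contents;
    contents << in.rdbuf(); // copy everything from the file buffer
    return contents.str();
}
```

Writing twice with std::ios::out leaves only the second text in the file, while a following write with std::ios::app adds to what is already there.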

Reading Data from a Sequential-Access File in C++

Files store data so that the data may be retrieved for processing when needed. The previous section demonstrated how to create a file for sequential access. This section shows how to read data sequentially from a file.

Program for Reading a File in C++

#include <iostream>
using std::cout;
using std::ios;
using std::cerr;
using std::endl;
using std::left;
using std::right;
using std::fixed;
using std::showpoint;
#include <fstream>
using std::ifstream;
#include <iomanip>
using std::setw;
using std::setprecision;
#include <cstdlib>
using std::exit;

void outputLine(int, const char *const, double); // prototype

int main()
{
    // ifstream constructor opens the file for input only
    ifstream inClientFile("clients.dat", ios::in);

    // exit program if ifstream could not open file
    if (!inClientFile)
    {
        cerr << "File could not be opened" << endl;
        exit(1);
    }

    int account;
    char name[30];
    double balance;

    cout << left << setw(10) << "Account" << setw(13)
         << "Name" << "Balance" << endl << fixed << showpoint;

    // display each record in the file
    while (inClientFile >> account >> name >> balance)
        outputLine(account, name, balance);

    return 0; // ifstream destructor closes the file
}

// display a single record from the file
void outputLine(int account, const char *const name, double balance)
{
    cout << left << setw(10) << account << setw(13) << name
         << setw(7) << setprecision(2) << right << balance
         << endl;
}

OUTPUT:

Account   Name         Balance
100       Rahul          24.98
200       Raju          345.67
300       Dev             0.00
400       Nitesh        -42.16
500       Satish        224.62

Open a file for input only (using ios::in) if the file’s contents should not be modified. This prevents unintentional modification of the file’s contents and is an example of the principle of least privilege.

Updating Sequential-Access Files in C++

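Data that are formatted and written to a sequential-access file, as shown above, cannot be modified in place without the risk of destroying other data in the file. For example, if the name “White” needs to be changed to “Worthington,” the old name cannot be overwritten without corrupting the file. The record for White was written to the file as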

300 White 0.00

If this record were rewritten beginning at the same location in the file using the longer name, the record would be

300 Worthington 0.00

The new record contains six more characters than the original record. Therefore, the characters beyond the second “o” in “Worthington” would overwrite the beginning of the next sequential record in the file. The problem is that in the formatted input/output model using the insertion operator << and the extraction operator >>, fields, and hence records, can vary in size. Therefore, the formatted input/output model usually is not used to update records in place.

Such updating can be done, but awkwardly. For example, to make the preceding name change, the records before 300 White 0.00 in a sequential-access file could be copied to a new file, the updated record then would be written to the new file, and the records after 300 White 0.00 would be copied to the new file. This requires processing every record in the file to update one record. If many records are being updated in one pass of the file, this technique can be acceptable.
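The copy-and-replace approach described above might look like the following sketch. It works on generic streams, so it can be tried with string streams or with an ifstream and ofstream pair. The function name copyWithNewName is an assumption, and the record layout (account, one-word name, balance) follows the earlier listings.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies every "account name balance" record from oldFile to newFile,
// replacing the name of the record whose account number matches.
// Returns true if the target record was found.
bool copyWithNewName(std::istream &oldFile, std::ostream &newFile,
                     int targetAccount, const std::string &newName)
{
    int account;
    std::string name;
    std::string balance; // kept as text so its formatting is copied unchanged
    bool updated = false;

    while (oldFile >> account >> name >> balance)
    {
        if (account == targetAccount)
        {
            name = newName; // a longer name is safe: nothing is overwritten
            updated = true;
        }
        newFile << account << ' ' << name << ' ' << balance << '\n';
    }
    return updated;
}
```

After the copy succeeds, a real program would typically replace the old file with the new one. That step is left out here.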

Random Access File in C++

Sequential-access files are inappropriate for instant-access applications, in which a particular record must be located immediately. Common instant-access applications are airline-reservation systems, banking systems, point-of-sale systems, automated teller machines (ATMs) and other kinds of transaction-processing systems that require rapid access to specific data. A bank might have hundreds of thousands (or even millions) of other customers, yet when a customer uses an automated teller machine, the program checks that customer’s account in seconds for sufficient funds. This kind of instant access is made possible with random-access files. Individual records of a random-access file can be accessed directly (and quickly) without having to search other records.

As we have said, C++ does not impose structure on a file, so an application that wants to use random-access files must create that structure itself. A variety of techniques can be used. Perhaps the easiest is to require that all records in a file be of the same fixed length. Using fixed-length records makes it easy for a program to calculate (as a function of the record size and the record key) the exact location of any record relative to the beginning of the file.
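The location calculation is a single multiplication. The sketch below assumes record keys start at 1, as the account numbers in this article's later programs do. The function name recordOffset is made up for the example.

```cpp
#include <cstddef>
#include <ios>

// Byte offset, from the beginning of the file, of the record with this key.
// Assumes keys are numbered 1, 2, 3, ... and every record has the same size.
std::streamoff recordOffset(int key, std::size_t recordSize)
{
    return static_cast<std::streamoff>(key - 1) *
           static_cast<std::streamoff>(recordSize);
}
```

With 40-byte records, for instance, the record with key 37 starts 36 * 40 = 1440 bytes into the file.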

Data can be inserted into a random-access file without destroying other data in the file. Data stored previously also can be updated or deleted without rewriting the entire file.
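A sketch of such an in-place update follows. It uses a small fixed-size struct rather than the article's ClientData class, and the names SlotRecord, createSlots, writeSlot and readSlot are all assumptions for the example.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Hypothetical fixed-length record: every instance occupies the same bytes.
struct SlotRecord
{
    int account;
    char name[16];
    double balance;
};

// Byte offset of a 1-based slot number.
std::streamoff slotOffset(int slot)
{
    return static_cast<std::streamoff>(slot - 1) *
           static_cast<std::streamoff>(sizeof(SlotRecord));
}

// Creates a file holding 'count' zeroed records.
void createSlots(const std::string &fileName, int count)
{
    std::ofstream out(fileName, std::ios::out | std::ios::binary);
    SlotRecord blank = {}; // all members zero
    for (int i = 0; i < count; ++i)
        out.write(reinterpret_cast<const char *>(&blank), sizeof(SlotRecord));
}

// Overwrites only the record numbered rec.account; other records stay intact.
bool writeSlot(const std::string &fileName, const SlotRecord &rec)
{
    // in | out keeps the existing contents instead of truncating the file
    std::fstream file(fileName, std::ios::in | std::ios::out | std::ios::binary);
    if (!file)
        return false;
    file.seekp(slotOffset(rec.account));
    file.write(reinterpret_cast<const char *>(&rec), sizeof(SlotRecord));
    return static_cast<bool>(file);
}

// Reads the record stored in the given 1-based slot.
SlotRecord readSlot(const std::string &fileName, int slot)
{
    SlotRecord rec = {};
    std::ifstream in(fileName, std::ios::in | std::ios::binary);
    in.seekg(slotOffset(slot));
    in.read(reinterpret_cast<char *>(&rec), sizeof(SlotRecord));
    return rec;
}
```

Only sizeof(SlotRecord) bytes are touched by each writeSlot call, so the neighbouring records keep whatever they held before.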

Creating a Random-Access File

The ostream member function write outputs a fixed number of bytes, beginning at a specific location in memory, to the specified stream. When the stream is associated with a file, function write writes the data at the location in the file specified by the “put” file-position pointer. The istream member function read inputs a fixed number of bytes from the specified stream to an area in memory beginning at a specified address. If the stream is associated with a file, function read inputs bytes at the location in the file specified by the “get” file-position pointer.

When writing an integer number to a file, instead of using the statement

outFile<<number;

which could print as few as one digit or as many as 11 digits (10 digits plus a sign, each of which requires a single byte of storage) for a four-byte integer, we can use the statement

outFile.write(reinterpret_cast<const char*>(&number),sizeof(number));

which always writes four bytes (on a machine with four-byte integers). Function write expects data type const char* as its first argument; hence, we use operator reinterpret_cast<const char*> to convert the address of number to a const char* pointer. The second argument of write is an integer count (of type std::streamsize, to which the size_t result of sizeof converts) specifying the number of bytes to be written. As we will see, istream function read then can be used to read the four bytes back into integer variable number.
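A round trip through write and read can be sketched as follows. The function names are assumptions. The file is opened with ios::binary so that no character translation is applied to the raw bytes.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Writes the raw bytes of an int (sizeof(int) of them) to the file.
void writeRawInt(const std::string &fileName, int number)
{
    std::ofstream outFile(fileName, std::ios::out | std::ios::binary);
    outFile.write(reinterpret_cast<const char *>(&number), sizeof(number));
}

// Reads sizeof(int) raw bytes back into an int.
int readRawInt(const std::string &fileName)
{
    int number = 0;
    std::ifstream inFile(fileName, std::ios::in | std::ios::binary);
    inFile.read(reinterpret_cast<char *>(&number), sizeof(number));
    return number;
}

// Returns the size of the file in bytes.
std::streamoff fileSize(const std::string &fileName)
{
    std::ifstream inFile(fileName,
                         std::ios::in | std::ios::binary | std::ios::ate);
    return inFile.tellg(); // ate: the get pointer starts at the end of the file
}
```

Whatever value is written, the file always has sizeof(int) bytes, in contrast to the variable number of digit characters produced by the << operator.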

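If a program reads unformatted data (written by write), it must be compiled and executed on a system that is compatible with the program that wrote the data, because the size and byte order of built-in types such as int can differ between compilers and machines.

Random-access file-processing programs rarely write a single field to a file. Typically they write one whole record (here, one ClientData object) at a time, as the following programs do.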
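The programs that follow include a header clientData.h that is not listed in this article. Purely as an assumption about what it might contain, the sketch below defines a ClientData class with fixed-size members and the accessor names those programs call. The array sizes (15 and 10) mirror the input buffers used later, and everything else is a guess.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the class declared in clientData.h.
class ClientData
{
public:
    // A default-constructed record is "blank": account number 0, empty names.
    ClientData(int account = 0, const std::string &last = "",
               const std::string &first = "", double bal = 0.0)
        : accountNumber(account), balance(bal)
    {
        setLastName(last);
        setFirstName(first);
    }

    void setAccountNumber(int account) { accountNumber = account; }
    int getAccountNumber() const { return accountNumber; }

    void setLastName(const std::string &last)
    {
        copyInto(lastName, sizeof(lastName), last);
    }
    std::string getLastName() const { return lastName; }

    void setFirstName(const std::string &first)
    {
        copyInto(firstName, sizeof(firstName), first);
    }
    std::string getFirstName() const { return firstName; }

    void setBalance(double bal) { balance = bal; }
    double getBalance() const { return balance; }

private:
    // Copies at most size-1 characters and zero-fills the rest of the array.
    static void copyInto(char *dest, std::size_t size, const std::string &src)
    {
        std::memset(dest, 0, size);
        std::strncpy(dest, src.c_str(), size - 1);
    }

    int accountNumber;
    char lastName[15]; // fixed-size arrays keep every record the same length
    char firstName[10];
    double balance;
};
```

Because the members are plain fixed-size data, sizeof(ClientData) is the same for every record, which is what the seekp arithmetic in the later programs relies on. A class holding std::string members could not be written with write in this way, since the characters of a std::string may live outside the object.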

Program for Creating Random-Access File in C++

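#include <iostream>
using std::cerr;
using std::endl;
using std::ios;
#include <fstream>
using std::ofstream;
#include <cstdlib>
using std::exit;
#include "clientData.h" // ClientData class definition

int main()
{
    ofstream outCredit("credit.dat", ios::binary);

    // exit program if ofstream could not open file
    if (!outCredit)
    {
        cerr << "File could not be opened" << endl;
        exit(1);
    }

    // default-constructed record; account number 0 marks it as empty
    ClientData blankClient;

    // output 100 blank records to the file
    for (int i = 0; i < 100; i++)
        outCredit.write(reinterpret_cast<const char *>(&blankClient),
                        sizeof(ClientData));

    return 0;
}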

Writing Data Randomly to a Random-Access File

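The following program writes data to the file “credit.dat”. It uses the combination of ostream functions seekp and write to store data at exact locations in the file: seekp sets the “put” file-position pointer to a specific position in the file, then write outputs the data.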

Writing to a random-access file in C++

#include <iostream>
using std::cerr;
using std::endl;
using std::cout;
using std::cin;
using std::ios;
#include <iomanip>
using std::setw;
#include <fstream>
using std::fstream;
#include <cstdlib>
using std::exit;
#include "clientData.h" // ClientData class definition

int main()
{
    int accountNumber;
    char lastName[15];
    char firstName[10];
    double balance;

    // open for input and output so the existing blank records are kept;
    // an ofstream opened with only ios::binary would truncate the file
    fstream outCredit("credit.dat", ios::in | ios::out | ios::binary);

    // exit program if fstream cannot open file
    if (!outCredit)
    {
        cerr << "File could not be opened" << endl;
        exit(1);
    }

    cout << "Enter account number "
         << "(1 to 100, 0 to end input)\n";

    ClientData client;
    cin >> accountNumber;
    client.setAccountNumber(accountNumber);

    // user enters information, which is copied into the file
    while (client.getAccountNumber() > 0 && client.getAccountNumber() <= 100)
    {
        cout << "Enter lastname, firstname, balance\n";
        cin >> setw(15) >> lastName; // setw limits input to the array size
        cin >> setw(10) >> firstName;
        cin >> balance;

        client.setLastName(lastName);
        client.setFirstName(firstName);
        client.setBalance(balance);

        // seek to the record's position in the file, then write the record
        outCredit.seekp((client.getAccountNumber() - 1) * sizeof(ClientData));
        outCredit.write(reinterpret_cast<const char *>(&client),
                        sizeof(ClientData));

        cout << "Enter account number\n";
        cin >> accountNumber;
        client.setAccountNumber(accountNumber);
    }

    return 0;
}

OUTPUT:

Enter account number (1 to 100, 0 to end input)
37
Enter lastname, firstname, balance
Vicky Singh 0.00
Enter account number
2922
Enter lastname, firstname, balance
Satish Singh 24.33
5556

Reading Data Sequentially from a Random-Access File in C++

The following program reads every record in the “credit.dat” file sequentially, checks each record to determine whether it contains data, and displays formatted output for records containing data.

Program for Reading Data Sequentially from a Random-Access File in C++

#include <iostream>
using std::cout;
using std::endl;
using std::ios;
using std::cerr;
using std::left;
using std::right;
using std::fixed;
using std::showpoint;
#include <iomanip>
using std::setprecision;
using std::setw;
#include <fstream>
using std::ifstream;
using std::ostream;
#include <cstdlib>
using std::exit;
#include "clientData.h" // ClientData class definition

void outputLine(ostream &, const ClientData &); // prototype

int main()
{
    // open in binary mode, matching how the records were written
    ifstream inCredit("credit.dat", ios::in | ios::binary);

    // exit program if ifstream cannot open file
    if (!inCredit)
    {
        cerr << "File could not be opened" << endl;
        exit(1);
    }

    cout << left << setw(10) << "Account" << setw(16)
         << "Last Name" << setw(11) << "First Name" << left
         << setw(10) << right << "Balance" << endl;

    ClientData client;

    // read the first record from the file
    inCredit.read(reinterpret_cast<char *>(&client), sizeof(ClientData));

    // read all records from the file
    while (inCredit && !inCredit.eof())
    {
        // display the record only if it contains data
        if (client.getAccountNumber() != 0)
            outputLine(cout, client);

        // read the next record from the file
        inCredit.read(reinterpret_cast<char *>(&client), sizeof(ClientData));
    }

    return 0;
}

// display a single record
void outputLine(ostream &output, const ClientData &record)
{
    output << left << setw(10) << record.getAccountNumber()
           << setw(16) << record.getLastName().data()
           << setw(11) << record.getFirstName().data()
           << setw(10) << setprecision(2) << right << fixed
           << showpoint << record.getBalance() << endl;
}

OUTPUT:

Account   Last Name       First Name    Balance
29        Brown           Nancy          -24.54
33        Dunn            Stacey         314.33
37        Barker          Doug             0.00
88        Smith           Dave           258.34
96        Stone           Sam             34.98

Input/Output of Objects in C++

We accomplished object input by overloading the stream-extraction operator >> for the appropriate istream. We accomplished object output by overloading the stream-insertion operator << for the appropriate ostream. In both cases, only an object’s data members were input or output, and, in each case, they were in a format meaningful only for objects of that particular abstract data type. An object’s member functions are not input or output with its data. They remain available internally in the computer and are combined with the data values as these data are input by the overloaded stream-extraction operator.

When object data members are output to a disk file, we lose the object’s type information. We store only data bytes, not type information, on a disk. If the program that reads this data knows the object type to which the data corresponds, the program can read the data into objects of that type.
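As a reminder of what such overloads look like, here is a minimal sketch for a hypothetical Point type (the type and its text format are assumptions for illustration). Only the data members travel through the stream, and the reading side has to know in advance that the text represents a Point.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical type used only for this illustration.
struct Point
{
    int x = 0;
    int y = 0;
};

// Output: writes only the data members, separated by a space.
std::ostream &operator<<(std::ostream &out, const Point &p)
{
    return out << p.x << ' ' << p.y;
}

// Input: reads the two data members back in the same order.
std::istream &operator>>(std::istream &in, Point &p)
{
    return in >> p.x >> p.y;
}
```

Nothing in the stored text says "this is a Point". The same two numbers could be read just as easily into two unrelated int variables, which is the loss of type information described above.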
