unit5-updated_notes
Unit 5
FILES, MODULES, PACKAGES
Objective
Files and exceptions: text files, reading and writing files, format operator; command line arguments, errors and exceptions, handling exceptions, modules, packages; Illustrative programs: word count, copy file.
Key Terms
File : A file is a container in a computer system for storing information. Files used in computers are similar to the paper documents kept in library and office files. In a computer operating system, files can be stored on optical drives, hard drives or other types of storage devices.
Binary file : A binary file is a collection of bytes. The data written to and read from a binary file remains unchanged, with no separation between lines and no use of end-of-line characters; the interpretation of the file is left to the programmer.
Text file : A text file is a stream of characters that can be processed sequentially and logically in the forward direction. Some older environments limit each line to 255 characters; Python itself does not impose such a limit.
Sequential file access: In sequential file access, data is read from or written to a file in order, while the position indicator is adjusted automatically by the stream I/O functions.
Random file access: Random access means reading from or writing to any position in a file, without reading or writing all the preceding data, by controlling the position indicator.
Record: A record consists of a collection of data fields that conforms to a previously defined structure and that can be stored in or retrieved from a file.
Stream: The stream is a common, logical interface to the various devices that comprise the computer and is a logical interface to a file. Although files differ in form and capabilities, all streams are the same.
File management: It basically means all operations related to creating, renaming, deleting, merging, reading, writing, etc. of any type of files.
Path: The path specifies the drive and/or directory (or folder) where the file is located. On Windows PCs, the backslash character (\) is used to separate directory names in a path. Some systems like Unix use the forward slash (/) as the directory separator.
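Because the separator differs between systems, Python code usually builds paths with library helpers instead of typing the separator by hand. A minimal sketch, assuming a made-up folder layout (the names notes, unit5 and data.txt are only for illustration):

```python
import os
from pathlib import Path

# os.path.join inserts the separator used by the current operating system.
joined = os.path.join("notes", "unit5", "data.txt")  # hypothetical names
print(joined)
print("Separator on this system:", os.sep)

# pathlib offers the same idea with the / operator.
p = Path("notes") / "unit5" / "data.txt"
name = p.name      # last component of the path
suffix = p.suffix  # file extension, including the dot
print(name, suffix)
```

Neither call touches the disk; both only build the path text.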
Types of Files
Text files
Data files
Directory files
Binary files
Graphic files
Basic File Operations
Creation of a new file
Modification of data or file attributes
Reading of data from the file
Opening the file in order to make the contents available to other programs
Writing data to the file
Closing or terminating a file operation
Reading Text Files
Demo 1:
Code example
File contents
Output
Demo https://repl.it/@kiteit/ReadFromFileIntro
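A minimal sketch of reading a text file, in case the linked demo is not available. The file name sample.txt and its contents are assumptions made for this example; the file is created first so the code runs on its own.

```python
import os
import tempfile

# Create a small sample file in a temporary folder (illustrative content).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("Hello\nWorld\n")

# Open in the default mode ("r", text mode) and read the whole file.
# The with statement closes the file automatically at the end of the block.
with open(path) as f:
    content = f.read()

print(content)
```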
Writing Text Files
code
output
destination
Demo https://repl.it/@kiteit/WritingToFile
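A minimal sketch of writing a text file, assuming a made-up file name (destination.txt) in a temporary folder. It writes with write() and writelines(), then reads the file back to check the result.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "destination.txt")  # hypothetical name

# "w" creates the file, or empties it if it already exists.
with open(path, "w") as f:
    written = f.write("First line\n")  # write() returns the number of characters written
    f.writelines(["Second line\n", "Third line\n"])  # no newlines are added for us

# Read the file back to confirm what was stored.
with open(path) as f:
    lines = f.readlines()

print(written)
print(lines)
```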
Types of File Modes
r
Opens a file for reading only. The file pointer is placed at the beginning of the file. This is the default mode.
rb
Opens a file for reading only in binary format. The file pointer is placed at the beginning of the file.
r+
Opens a file for both reading and writing. The file pointer is placed at the beginning of the file.
rb+
Opens a file for both reading and writing in binary format. The file pointer is placed at the beginning of the file.
w
Opens a file for writing only. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.
wb
Opens a file for writing only in binary format. Overwrites the file if the file exists. If the file does not exist, creates a new file for writing.
w+
Opens a file for both writing and reading. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.
wb+
Opens a file for both writing and reading in binary format. Overwrites the existing file if the file exists. If the file does not exist, creates a new file for reading and writing.
a
Opens a file for appending. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.
ab
Opens a file for appending in binary format. The file pointer is at the end of the file if the file exists. That is, the file is in the append mode. If the file does not exist, it creates a new file for writing.
a+
Opens a file for both appending and reading. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.
ab+
Opens a file for both appending and reading in binary format. The file pointer is at the end of the file if the file exists. The file opens in the append mode. If the file does not exist, it creates a new file for reading and writing.
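The behaviour of a few of these modes can be checked with a short experiment. This is a sketch using a temporary file with a made-up name (modes.txt); it covers only r, w, r+ and a+.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical name

# "r" requires the file to exist already.
try:
    open(path, "r")
    missing_error = False
except FileNotFoundError:
    missing_error = True

# "w" creates the file when it does not exist.
with open(path, "w") as f:
    f.write("abc")

# "r+" keeps the content; the pointer starts at 0, so writing overwrites from the start.
with open(path, "r+") as f:
    start_pos = f.tell()
    f.write("X")
with open(path) as f:
    after_rplus = f.read()

# "a+" writes at the end; to read the content we must seek back first.
with open(path, "a+") as f:
    f.write("d")
    f.seek(0)
    after_aplus = f.read()

# Opening with "w" again empties the existing file.
with open(path, "w"):
    pass
size_after_w = os.path.getsize(path)

print(missing_error, start_pos, after_rplus, after_aplus, size_after_w)
```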
File Methods
close()
Closes the file
fileobject.close()
flush()
Flushes the internal buffer
fileobject.flush()
read()
Returns the file content
data = fileobject.read()
readable()
Returns whether the file stream can be read or not
boolean = fileobject.readable()
readline()
Returns one line from the file
data = fileobject.readline()
readlines()
Returns a list of lines from the file
filelines = fileobject.readlines()
seek()
Change the file position
position = fileobject.seek(offset)
seekable()
Returns whether the file allows us to change the file position
bool = fileobject.seekable()
tell()
Returns the current file position
position = fileobject.tell()
truncate()
Resizes the file to a specified size
fileobject.truncate(size)
writable()
Returns whether the file can be written to or not
bool = fileobject.writable()
write()
Writes the specified string to the file
fileobject.write(data)
writelines()
Writes a list of strings to the file
fileobject.writelines(listofstrings)
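Several of these methods can be tried together on one file object. The sketch below assumes a temporary file with a made-up name (methods.txt), opened in w+ mode so that it can be both written and read.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "methods.txt")  # hypothetical name

with open(path, "w+") as f:
    f.write("0123456789")
    f.flush()                 # push buffered data out to the file
    end_pos = f.tell()        # position after writing ten characters

    f.seek(4)                 # move the position to offset 4
    tail = f.read()           # read from offset 4 to the end

    can_read = f.readable()
    can_write = f.writable()
    can_seek = f.seekable()

    f.truncate(5)             # keep only the first five bytes
    f.seek(0)
    head = f.read()

print(end_pos, tail, head, can_read, can_write, can_seek)
```

close() is not called explicitly here because the with statement closes the file on exit.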
Difference between Append and Write Mode
Write (w) mode and Append (a) mode are almost the same when opening a file. Both are used to write to a file. In both modes, a new file is created if it does not already exist.
The only difference is that when you open a file in write mode, the file is reset, which deletes any data already present in it. In append mode this does not happen. Append mode is used to add data after the existing data in the file (if any). Hence, when you open a file in Append (a) mode, the cursor is positioned at the end of the existing data in the file.
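The difference can be seen by writing to the same file in both modes. A minimal sketch, assuming a temporary file with a made-up name (log.txt):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical name

with open(path, "w") as f:   # creates the file
    f.write("one\n")

with open(path, "a") as f:   # keeps "one" and adds after it
    f.write("two\n")
with open(path) as f:
    after_append = f.read()

with open(path, "w") as f:   # resets the file before writing
    f.write("three\n")
with open(path) as f:
    after_write = f.read()

print(repr(after_append))
print(repr(after_write))
```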
Reading binary files in python
Code example
This is not a text file, so we need a special program, such as an image viewer, to interpret it.
Demo https://repl.it/@kiteit/BinaryFile
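A minimal sketch of reading a binary file. Instead of a real image, it first writes a few arbitrary bytes to a temporary file (the name data.bin and the byte values are assumptions for this example) and then reads them back in rb mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")  # hypothetical name

# Arbitrary byte values chosen for illustration; they are not a valid image.
data = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
with open(path, "wb") as f:
    f.write(data)

# In binary mode, read() returns a bytes object and nothing is decoded or translated.
with open(path, "rb") as f:
    raw = f.read()

print(type(raw))
print(list(raw))  # the individual byte values as integers
```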
Files
What are files? A file is a sequential stream of bytes ending with an end-of-file marker.
As we know, every program runs in main memory. Main memory is volatile, so the data is lost once the program terminates. If we need the same data again, we have to store it in a file on the disk.
Storage of data in variables and arrays is temporary; such data is lost when a program terminates. Files are used for permanent retention of data. Computers store files on secondary storage devices, such as hard drives, CDs, DVDs and flash drives. In this unit, we explain how data files are created, updated and processed by Python programs. We consider both sequential-access and random-access file processing.
File Extensions
We can usually tell if a file is binary or text based on its file extension. This is because by convention the extension reflects the file format, but ultimately it is the file format that dictates whether the file data is binary or text.
Common extensions that are binary file formats:
Common extensions that are text file formats:
Text File Characteristics
By convention, the data in every text file obeys a number of rules:
The text looks readable to a human or at least moderately sane. Even if it contains a heavy proportion of punctuation symbols (like HTML, RTF, and other markup formats), there is some visible structure and it’s not seemingly random garbage.
The data format is usually line-oriented. Each line could be a separate command, or a list of values could put each item on a different line, etc. The maximum number of characters in each line is usually a reasonable value like 100, not like 1000.
Binary File Characteristics
For most software that people use in their daily lives, the software consumes and produces binary files. Examples of such software include Microsoft Office, Adobe Photoshop, and various audio/video/media players. A typical computer user works with mostly binary files and very few text files.
A binary file always needs matching software to read or write it. For example, an MP3 file can be produced by a sound recorder or audio editor, and it can be played in a music player or audio editor. But an MP3 file cannot be played in an image viewer or database software.
Some binary formats are popular enough that a wide variety of programs can produce or consume them. Image formats like JPEG are the best example: not only can they be used in image viewers and editors, they can also be viewed in web browsers, audio players (for album art), and document software (such as adding a picture into a Word doc).
Reading a File (Extended ....)
Three methods help us do this.
read()
readline()
readlines()
read
It reads the entire content of the file and returns it as a string in text mode and as a bytes object in binary mode.
If we want to read only a given number of characters (or bytes in binary mode), we can pass size (any positive value) as a parameter.
Example
SOURCE FILE
CODE
OUTPUT
Demo: [https://repl.it/@kiteit/ReadFromFile]
With Size Parameter
CODE
OUTPUT
• Here size is 3, so 3 characters are returned.
Demo: [https://repl.it/@kiteit/ReadFromFileWithSize]
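A minimal sketch of read with a size parameter, using a temporary file whose name (words.txt) and content are made up for this example. Each call continues from where the previous one stopped.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "words.txt")  # hypothetical name
with open(path, "w") as f:
    f.write("Python files")

with open(path) as f:
    first = f.read(3)    # the first 3 characters
    second = f.read(3)   # the next 3 characters
    rest = f.read()      # everything that is left

print(first, second, rest)
```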
To Think:
What if we pass negative value as size?
What if we pass size that is greater than file size?
readline
It reads a line from a file each time it is called.
Example
Demo: [https://repl.it/@kiteit/ReadLineFromFile]
To Think:
We need to perform a readline operation for every line the source file has.
What if the number of readline operations is lower than the number of lines in the source file?
What if we overrun it, so that the number of readline operations is greater than the number of lines in the source file?
How do we know the number of lines in the file?
How do we stop the readline operation after reaching the end of the file?
What about the idea of using loops?
Controlled Readline
CODE
OUTPUT
The while loop stops when readline returns an empty string.
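A minimal sketch of such a controlled loop, assuming a temporary file with a made-up name (lines.txt) and three short lines. A blank line inside a file is returned as "\n", so only the real end of the file yields an empty string.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical name
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

collected = []
with open(path) as f:
    line = f.readline()
    while line != "":                       # an empty string means end of file
        collected.append(line.rstrip("\n"))  # drop the trailing newline
        line = f.readline()

print(collected)
```

In everyday Python code the same result is usually obtained by iterating over the file object directly (for line in f), which also stops at the end of the file.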
readlines
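It returns a list of strings, where each string is a line from the file.
If we want to limit how much is read, we can pass a size hint (any positive value) as a parameter. No more lines are read once the total size (in bytes or characters) of the lines read so far exceeds the hint; lines are always returned whole.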
CODE
OUTPUT
It returns a list of lines.
Demo:[https://repl.it/@kiteit/ReadLinesFromFile]
With Size Parameter
CODE
OUTPUT
• Here the size in bytes is 7.
Demo: [https://repl.it/@kiteit/ReadLinesFromFileWithSize]
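A minimal sketch of readlines with and without a size, using a temporary file whose name (lines.txt) and content are assumptions for this example. With a size of 7, the first line (4 characters) is below the limit, the second line takes the total past 7, and reading stops there.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # hypothetical name
with open(path, "w") as f:
    f.write("one\ntwo\nthree\n")

with open(path) as f:
    all_lines = f.readlines()     # every line, each with its newline

with open(path) as f:
    some_lines = f.readlines(7)   # stops after the line that takes the total past 7
    remaining = f.readlines()     # the lines that were not read yet

print(all_lines)
print(some_lines)
print(remaining)
```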