Ask for high marks! Memory aspect
Chapter II Storage and processing of enterprise information

The core of the information age is undoubtedly information technology, and the core of information technology lies in processing and storing information.

2.1 Data representation

2.1.1 Representation of information, numbers and characters

1. Information representation

A logic component that stores data has two states, high potential and low potential, which correspond to "1" and "0" respectively. In a computer, if one potential state represents one information unit, then a 1-bit binary number can represent 2 information units. A 2-bit binary number can represent 4 information units, and a 3-bit binary number can represent 8. The number of representable information units grows as a power of 2 with the number of bits: an n-bit binary number can represent $2^n$ different information units.

Conversely, if there are 18 information units to be represented, how many binary digits should be used? A 4-bit binary number can represent only 16 information units, while a 5-bit binary number can represent 32. Therefore, representing 18 information units requires at least 5 binary digits.
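The relationship between bit count and information units can be checked with a short Python sketch. The helper name `bits_needed` is hypothetical and only serves this illustration.

```python
def bits_needed(units: int) -> int:
    """Smallest number of binary digits n such that 2**n >= units."""
    if units < 1:
        raise ValueError("units must be at least 1")
    # For units >= 1, (units - 1).bit_length() equals the ceiling of log2(units).
    return (units - 1).bit_length()


# n bits can represent 2**n information units.
for n in (1, 2, 3, 4, 5):
    print(n, "bits ->", 2 ** n, "information units")

# 18 units do not fit in 4 bits (16 units) but do fit in 5 bits (32 units).
print(bits_needed(18))
```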

When a computer stores data, it usually treats an 8-bit binary number as one storage unit, called a byte. Storage capacity is counted in powers of 2: $2^{10}$ (i.e. 1024) storage units are called 1 Kbyte; $2^{10}$ K (i.e. 1024 K) storage units are called 1 Mbyte; and $2^{10}$ M (i.e. 1024 M) storage units are called 1 Gbyte.
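As a small illustration of these units, the following Python sketch expresses each unit in bytes. The constant names are chosen for this example only.

```python
BITS_PER_BYTE = 8

KBYTE = 2 ** 10          # 1024 bytes
MBYTE = 2 ** 10 * KBYTE  # 1024 Kbytes
GBYTE = 2 ** 10 * MBYTE  # 1024 Mbytes

for name, size in (("Kbyte", KBYTE), ("Mbyte", MBYTE), ("Gbyte", GBYTE)):
    print(f"1 {name} = {size} bytes = {size * BITS_PER_BYTE} bits")
```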

2. Number representation

Numerical data means decimal numbers stored in binary form. To represent a numerical value, three problems need to be solved.

First, determine the length of the number. In mathematics, the length of a number generally refers to the number of digits when it is written in decimal; for example, 258 has 3 digits and 124578 has 6 digits. In a computer, the length of a number is counted in binary digits. However, because the storage capacity of a computer is usually measured in bytes, data length is often counted in bytes as well. In mathematics the length of a number varies, and as many digits are written as the number requires. In a computer, if the length of data varied with the value, storage and processing would be inconvenient. Therefore, within the same computer, data length is usually uniform, and unused positions are filled with "0".

Second, numbers can be positive or negative. In a computer, the sign of a number is always represented by the most significant binary digit, with "0" for a positive number and "1" for a negative number; this bit is called the sign bit, and the remaining bits still represent the value. A number whose sign is stored in the machine in this digitized form is called a machine number, and the number written outside the machine with an explicit sign is called the true value. For example, if a number occupies 8 bits and its true value is (-0000111)B, its machine number is 10000111, as shown in Figure 2.1.1.

Figure 2.1.1

The range that a machine number can represent is limited by the word length and the data type. Once the word length and data type are fixed, the representable range is also fixed. For example, for an integer with a word length of 8 bits, the largest machine number is 01111111; the highest bit is the sign bit, so the maximum value is 127. A value above 127 will "overflow".
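A minimal Python sketch of the machine number described above, assuming the sign-and-magnitude layout (one sign bit followed by the absolute value). The function name `machine_number` is hypothetical.

```python
def machine_number(value: int, word_length: int = 8) -> str:
    """Return the sign-and-magnitude bit string of value for the given word length."""
    magnitude_bits = word_length - 1
    limit = 2 ** magnitude_bits - 1  # 127 for an 8-bit word
    if abs(value) > limit:
        raise OverflowError(f"{value} does not fit in {word_length} bits")
    sign = "1" if value < 0 else "0"  # "0" = positive, "1" = negative
    # Pad the magnitude with leading zeros to the fixed length.
    return sign + format(abs(value), f"0{magnitude_bits}b")


print(machine_number(-0b111))  # true value -111B in an 8-bit word
print(machine_number(127))     # largest 8-bit value

try:
    machine_number(128)
except OverflowError as err:
    print("overflow:", err)
```

Raising an exception on overflow is a choice made for this sketch; real hardware typically signals overflow through a status flag instead.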

Third, the position of the decimal point. When numerical data is represented in a computer, the position of the decimal point is always implied in order to save storage space. The implied position can be fixed or variable. The former is called a fixed-point number, and the latter a floating-point number.

1) Fixed-point representation method:

Fixed-point integer: the decimal point is agreed to lie after the lowest digit, and the format is used to represent integers.

Integers are divided into signed and unsigned. For signed integers, the sign bit is placed in the most significant bit. Integers are represented exactly, but their range is limited. Depending on the word length, they can be stored in 8, 16, 32 bits, and so on. See Table 2.1.1 for the corresponding ranges.

Table 2.1.1 Representable range for different numbers of binary digits

Binary digits    Range of unsigned integers
8                0 to 255
16               0 to 65,535
32               0 to 4,294,967,295

If the length of a signed integer is extended from 2 bytes to 4 bytes, the maximum representable integer grows from 32767 to 2147483647 ≈ $2.1 \times 10^9$, but the storage space occupied by each number also doubles.
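The ranges for each word length follow directly from the number of bits. The sketch below computes them in Python; the function names are hypothetical, and only the signed maximum is computed because the signed minimum depends on the encoding in use.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """All bits hold the value, so the range is 0 .. 2**bits - 1."""
    return 0, 2 ** bits - 1


def signed_max(bits: int) -> int:
    """One bit holds the sign, leaving bits - 1 bits for the value."""
    return 2 ** (bits - 1) - 1


for bits in (8, 16, 32):
    low, high = unsigned_range(bits)
    print(f"{bits:2d} bits: unsigned {low}..{high}, signed maximum {signed_max(bits)}")
```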

Fixed-point decimal: the decimal point is agreed to lie before the highest digit, and the format is used to represent a pure fraction less than 1.

For example, the decimal fraction -0.6876 is -0.1011... in binary. The binary expansion of 0.6876 does not terminate, so only the first 15 digits can be kept when it is stored, and everything from the 16th digit onward is dropped.

If a length of 2 bytes is used to represent a fixed-point decimal, the weight of the lowest digit is $2^{-15}$ (between $10^{-4}$ and $10^{-5}$), so the value is accurate to at most the 4th or 5th decimal place. Such a range and precision are hard to accept even in general applications. Larger or smaller numbers are therefore represented as floating-point numbers.
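The truncation to 15 fraction bits can be reproduced with exact rational arithmetic. This is a sketch under the assumption that the fraction is simply cut off (not rounded); `fraction_bits` is a hypothetical helper.

```python
from fractions import Fraction


def fraction_bits(value: Fraction, n_bits: int) -> str:
    """First n_bits binary digits of a fraction 0 <= value < 1, truncated."""
    bits = []
    for _ in range(n_bits):
        value *= 2               # shift the next binary digit in front of the point
        if value >= 1:
            bits.append("1")
            value -= 1
        else:
            bits.append("0")
    return "".join(bits)


x = Fraction("0.6876")                    # magnitude only; the sign is stored separately
bits = fraction_bits(x, 15)
stored = Fraction(int(bits, 2), 2 ** 15)  # value actually kept in 15 fraction bits
error = x - stored

print("0." + bits)                        # 0.101100000000011
print(float(error), "<", 2.0 ** -15)      # error is below the weight of the lowest bit
```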

2) Floating-point number representation method:

In scientific computing, very large or very small real numbers are written as "floating-point numbers", also called "scientific notation". A floating-point number consists of two parts, the mantissa and the exponent (order code). For example, in $0.23456 \times 10^5$, 0.23456 is the mantissa and 5 is the exponent.

In the floating-point representation, the position of the decimal point floats, and the exponent can take different values. To make the decimal point easy to represent in a computer, floating-point numbers are required to be written in normalized form, that is, the absolute value of the mantissa is greater than or equal to 0.1 and less than 1, which fixes the position of the decimal point uniquely. The length of the mantissa determines the precision of the number, and its sign determines the sign of the number. The exponent of a floating-point number corresponds to the exponent in mathematics, and its size determines the representable range.

Similarly, any normalized binary floating-point number can be written as:

$$N = \pm d \times 2^{\pm p}$$

where $d$ is the mantissa, and the "±" in front of it is the number sign; $p$ is the exponent, and the "±" in front of it is the exponent sign. Its storage form in the computer is shown in Figure 2.1.2.

Exponent sign | Exponent | Number sign | Mantissa

Figure 2.1.2 Storage format of floating point number

For example, let the mantissa be 8 bits and the exponent be 6 bits. See Figure 2.1.3 for how a binary number is stored as a floating-point number in this format.

Figure 2.1.3 Storage form
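To see the four parts of a normalized binary floating-point number (number sign, mantissa, exponent sign, exponent), Python's standard `math.frexp` can be used: it returns a mantissa whose absolute value lies in [0.5, 1), which is 0.1 in binary up to 1. The wrapper `normalize` is hypothetical, and this sketch does not model the 8-bit mantissa and 6-bit exponent widths of the figure.

```python
import math


def normalize(x: float) -> tuple[str, float, str, int]:
    """Split x into (number sign, |mantissa|, exponent sign, |exponent|)."""
    mantissa, exponent = math.frexp(x)  # x == mantissa * 2**exponent
    number_sign = "-" if mantissa < 0 else "+"
    exponent_sign = "-" if exponent < 0 else "+"
    return number_sign, abs(mantissa), exponent_sign, abs(exponent)


print(normalize(-6.5))     # -6.5 = -0.8125 * 2**3
print(normalize(0.15625))  # 0.15625 = +0.625 * 2**-2
```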

3) Representation of original code, one's complement (inverse code) and two's complement

"Original code" encoding

The fixed-point and floating-point representations described above all use the first bit of the data to represent the sign of the number, and the following bits to represent its absolute value (for both mantissa and exponent). This method is simple and easy to understand, but the arithmetic unit must be able to perform both addition and subtraction, and operands can be positive or negative, so arithmetic on original code involves many case checks. For example, when two numbers are added and their signs differ, a subtraction is actually performed; when two numbers are subtracted and their signs differ, an addition is actually performed, and so on. This makes the arithmetic unit more complex and the operations slower.

"One's complement" and "two's complement" encoding

How should negative numbers be handled? Encodings such as one's complement (inverse code) and two's complement were proposed for this purpose. The main advantage of two's complement arithmetic is that, by treating negative numbers appropriately, subtraction is converted into addition. Whether a sum or a difference is required, and whether the operands are positive or negative, every operation is an addition, which greatly simplifies addition and subtraction. The two's complement is usually obtained by way of the one's complement. A complete discussion of arithmetic operations should therefore cover not only the values but also the code system (original code, one's complement, two's complement, etc.).
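The following Python sketch illustrates how subtraction becomes addition in two's complement, and how the two's complement of a negative number is its one's complement plus 1. The function names and the 8-bit word length are assumptions made for this example, and values are assumed to fit in the word.

```python
def ones_complement(value: int, bits: int = 8) -> str:
    """Positive numbers are unchanged; for negative numbers the magnitude bits are inverted."""
    if value >= 0:
        return format(value, f"0{bits}b")
    return format((2 ** bits - 1) - abs(value), f"0{bits}b")


def twos_complement(value: int, bits: int = 8) -> str:
    """Positive numbers are unchanged; negative numbers wrap around modulo 2**bits."""
    return format(value % (2 ** bits), f"0{bits}b")


def subtract_by_adding(a: int, b: int, bits: int = 8) -> int:
    """Compute a - b using only addition of two's complement codes."""
    total = int(twos_complement(a, bits), 2) + int(twos_complement(-b, bits), 2)
    return total % (2 ** bits)  # discard the carry out of the highest bit


print(ones_complement(-7))        # 11111000
print(twos_complement(-7))        # 11111001 (one's complement plus 1)
print(subtract_by_adding(12, 7))  # 5
```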

3. Character representation:

Character encoding, or encoding for short, is the method of representing non-numeric data (such as characters and punctuation marks) with a series of binary digits. To represent the 26 English letters alone, 5 binary digits are enough. However, each English letter has an uppercase and a lowercase form, and there are many punctuation marks and other special symbols (such as $, #, @, &, +, etc.). Counting all of these together, there are 95 different characters to be represented. The three most widely used encodings are ASCII, ANSI and EBCDIC, and a fourth encoding, Unicode, is under development.

1) ASCII (American Standard Code for Information Interchange) is the most widely used. Files encoded in ASCII are called ASCII files. Standard ASCII uses 7 binary digits to represent 128 symbols, including uppercase and lowercase English letters, punctuation marks, digits and special control symbols.

2) ANSI (American National Standards Institute) encoding uses 8 binary digits to represent each character. Eight binary digits can represent 256 information units, so this code can encode 256 characters, symbols, etc. The first 128 characters in ANSI are encoded as in ASCII, except that a 0 is added as the highest bit. For example, the character "A" is represented as 1000001 in ASCII and as 01000001 in ANSI. Besides the 128 characters of ASCII, ANSI contains another 128 symbols, such as the copyright symbol, the pound symbol and foreign-language characters.
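A brief Python sketch of the 7-bit and 8-bit forms of the letter "A". For the upper 128 characters it assumes the Windows code page `cp1252`, which is one encoding commonly referred to as "ANSI"; the text does not say which code page is meant.

```python
ch = "A"
code = ord(ch)                       # numeric code of the character

ascii_7bit = format(code, "07b")     # 7 binary digits, as in standard ASCII
ansi_8bit = format(code, "08b")      # same code with a leading 0 as the highest bit

print(ch, code, ascii_7bit, ansi_8bit)

# Characters outside ASCII use codes 128..255 (assumption: cp1252 as the "ANSI" code page).
for symbol in ("©", "£"):
    byte = symbol.encode("cp1252")[0]
    print(symbol, byte, format(byte, "08b"))
```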

3) EBCDIC (Extended Binary-Coded Decimal Interchange Code) is an 8-bit character code developed by IBM for its mainframes. Note that even the first 128 characters of EBCDIC are encoded differently from ASCII or ANSI.

Generally speaking, the 128 characters defined by standard ASCII are enough to represent digits, letters, punctuation marks and special characters. ANSI covers all 128 ASCII characters and adds characters used in European languages. EBCDIC covers the standard characters and control codes. However, none of these encodings supports alternative character sets, nor languages that are not written with letters, such as Chinese and Japanese.

4) Unicode is a 16-bit encoding that can represent more than 65,000 different information units. In principle, Unicode can represent the characters of any language, whether currently in use or no longer in use. This encoding is very useful for international business and communication, because a single file may need to contain several languages, such as Chinese, Japanese and English. Unicode is also well suited to software localization, that is, adapting software for a specific country. With Unicode, software developers can modify screen prompts, menus, error messages and so on to suit the languages and scripts of different countries.
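As an illustration, the sketch below prints the 16-bit codes of an English, a Chinese and a Japanese character in one string, using Python's built-in `utf-16-be` codec. The sample characters are chosen for this example and all fit in 16 bits; characters beyond that range are not covered here.

```python
text = "A中あ"  # one English, one Chinese and one Japanese character

for ch in text:
    code_point = ord(ch)
    encoded = ch.encode("utf-16-be")  # two bytes per character for these samples
    print(ch, hex(code_point), format(code_point, "016b"), encoded.hex())

fits_in_16_bits = all(ord(ch) < 2 ** 16 for ch in text)
print(fits_in_16_bits)
```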

2.1.2 Representation of image data and video data

There are two very different graphic coding methods, namely bitmap coding and vector coding. The difference between them affects the quality of the image, the space needed to store it, the time needed to transmit it and the difficulty of modifying it. Video is a kind of image data, formed by playing a series of related images in succession. The video signal people usually talk about is the TV signal, which is analog, whereas computer video signals are digital.

1. Bitmap image:

A bitmap image stores the image according to the pixel positions on the screen. The simplest bitmap image is a monochrome image, which has only two colors: black and white. If the image unit for a pixel is black, it is represented by 0 in the computer; if it is white, it is represented by 1.

For monochrome images, the number of image units needed for a full-screen image is exactly equal to the number of pixels on the screen. If the horizontal resolution is 640 and the vertical resolution is 480, multiplying the two gives 640×480 = 307,200, so the screen has 307,200 pixels. Because a monochrome image uses one binary digit per pixel, the number of bytes needed to store a full-screen bitmap image is 307,200 ÷ 8 = 38,400. However, monochrome images look unrealistic and are rarely used.
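The one-bit-per-pixel storage can be sketched in Python as follows. The helpers `monochrome_bytes` and `pack_row` are hypothetical; packing the leftmost pixel into the highest bit of each byte is an assumption of this sketch, not something the text specifies.

```python
def monochrome_bytes(width: int, height: int) -> int:
    """Bytes needed when each pixel takes one bit, rounded up to whole bytes."""
    return (width * height + 7) // 8


def pack_row(pixels: list[int]) -> bytes:
    """Pack 0/1 pixels (0 = black, 1 = white) into bytes, 8 pixels per byte."""
    out = bytearray()
    for i in range(0, len(pixels), 8):
        chunk = pixels[i:i + 8]
        byte = 0
        for bit in chunk:
            byte = (byte << 1) | bit
        byte <<= 8 - len(chunk)  # pad a short final chunk with zeros
        out.append(byte)
    return bytes(out)


print(monochrome_bytes(640, 480))                 # 38400
print(pack_row([1, 0, 1, 0, 1, 0, 1, 0]).hex())   # aa
```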

Grayscale images look more realistic than monochrome images. A grayscale image renders the picture in shades of gray; the more gray levels are used, the more realistic the image looks. Computers usually display images with 256 gray levels. In a 256-level grayscale image, each pixel can be white, black or any of the gray levels in between, that is, each pixel has 256 possible values. Representing 256 information units takes 8 binary digits, so each pixel needs one byte of storage. A full-screen grayscale image with a resolution of 640×480 therefore needs 307,200 bytes of storage space.

Computers can display color images in 16, 256 or 16.7 million colors, giving users more realistic images.

In a 16-color image, each pixel can take one of 16 colors. To represent 16 different information units, each pixel needs 4 binary digits. A full-screen 16-color bitmap image therefore needs 153,600 bytes of storage.

In a 256-color bitmap image, each pixel can take one of 256 colors. To represent 256 different information units, each pixel needs 8 binary digits, that is, one byte. A full-screen 256-color bitmap image therefore needs 307,200 bytes of storage, which is twice that of a 16-color image and the same as that of a 256-level grayscale image.

A bitmap image with 16.7 million colors is called a 24-bit image or a true color image. Each pixel can take any of 16.7 million colors. To represent these 16.7 million different information units, each pixel needs 24 binary digits, that is, 3 bytes. A full-screen true color image therefore needs three times the storage of a 256-color image.
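The storage figures for the different color depths can be recomputed with a short Python sketch. The function `bitmap_bytes` is a hypothetical helper, and the 640×480 full-screen resolution is the one used in the examples above; no compression or file header is accounted for.

```python
def bitmap_bytes(width: int, height: int, colors: int) -> int:
    """Uncompressed storage for a bitmap whose pixels can take `colors` values."""
    bits_per_pixel = (colors - 1).bit_length()  # smallest n with 2**n >= colors
    return (width * height * bits_per_pixel + 7) // 8


for label, colors in (
    ("monochrome", 2),
    ("16 colors", 16),
    ("256 colors / 256 gray levels", 256),
    ("true color (24-bit)", 2 ** 24),
):
    print(f"{label}: {bitmap_bytes(640, 480, colors)} bytes")
```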

Files containing images are very large: they need a lot of storage and take a long time to transmit and download. For example, downloading a 256-color image with a resolution of 640×480 from the Internet takes at least 1 minute. A 16-color image takes half that time, while a true color image takes longer.

Two technologies can reduce the storage space and transmission time of images: data compression and image dithering. Data compression is introduced later; image dithering reduces file size mainly by reducing the number of colors in the image. Dithering technology