DMA(Direct Memory Access)
 
Direct memory access (DMA) is a feature of computerized systems that allows certain hardware subsystems to access main system memory independently of the CPU.

Without DMA, when the CPU is using programmed I/O, it is typically fully occupied for the entire duration of the read or write operation, and is thus unavailable to perform other work.  
 
With DMA, the CPU initiates the transfer, does other operations while the transfer is in progress, and receives an interrupt from the DMA controller when the operation is done. This feature is useful any time the CPU cannot keep up with the rate of data transfer, or where the CPU needs to perform useful work while waiting for a relatively slow I/O data transfer.  
 
Many hardware systems use DMA, including disk drive controllers, graphics cards, network cards and sound cards. DMA is also used for intra-chip data transfer in multi-core processors. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without DMA channels. Similarly, a processing element inside a multi-core processor can transfer data to and from its local memory without occupying its processor time, allowing computation and data transfer to proceed in parallel.
 
Modes of operation
Burst mode
An entire block of data is transferred in one contiguous sequence. Once the DMA controller is granted access to the system bus by the CPU, it transfers all bytes in the data block before releasing control of the system buses back to the CPU, which leaves the CPU inactive for a relatively long period. This mode is also called "block transfer mode".

Cycle stealing mode
The cycle stealing mode is used in systems in which the CPU should not be disabled for the length of time needed for burst transfer modes. In the cycle stealing mode, the DMA controller obtains access to the system bus the same way as in burst mode, using BR (Bus Request) and BG (Bus Grant) signals, which are the two signals controlling the interface between the CPU and the DMA controller. However, in cycle stealing mode, after one byte of data transfer, the control of the system bus is deasserted to the CPU via BG. It is then continually requested again via BR, transferring one byte of data per request, until the entire block of data has been transferred. By continually obtaining and releasing the control of the system bus, the DMA controller essentially interleaves instruction and data transfers. The CPU processes an instruction, then the DMA controller transfers one data value, and so on. On the one hand, the data block is not transferred as quickly in cycle stealing mode as in burst mode, but on the other hand the CPU is not idled for as long as in burst mode. Cycle stealing mode is useful for controllers that monitor data in real time.

Transparent mode
The transparent mode takes the most time to transfer a block of data, yet it is also the most efficient mode in terms of overall system performance. The DMA controller only transfers data when the CPU is performing operations that do not use the system buses. It is the primary advantage of the transparent mode that the CPU never stops executing its programs and the DMA transfer is free in terms of time. The disadvantage of the transparent mode is that the hardware needs to determine when the CPU is not using the system buses, which can be complex.
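The tradeoff between burst mode and cycle stealing can be seen in a toy timing model (purely illustrative; real hardware arbitrates the bus with BR/BG signals, and the "cycles" here are abstract time slots):

```python
# Toy model: each list element is one bus cycle, owned by either the
# DMA controller or the CPU. Illustrative only, not a hardware model.

def burst(block_size, cpu_work):
    """Burst mode: DMA holds the bus for the whole block, then the CPU runs."""
    return ["DMA"] * block_size + ["CPU"] * cpu_work

def cycle_stealing(block_size, cpu_work):
    """Cycle stealing: one DMA byte, then one CPU instruction, interleaved."""
    timeline, d, c = [], block_size, cpu_work
    while d or c:
        if d:
            timeline.append("DMA")
            d -= 1
        if c:
            timeline.append("CPU")
            c -= 1
    return timeline

# Same total time either way, but cycle stealing lets the CPU make
# progress before the block finishes, at the cost of a slower transfer.
print(burst(4, 4))           # DMA x4, then CPU x4
print(cycle_stealing(4, 4))  # DMA, CPU, DMA, CPU, ...
```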
 
 
Reference:
 1) http://en.wikipedia.org/wiki/Direct_memory_access



FireWire(IEEE 1394)

 

IEEE 1394, also known as FireWire or i.Link, is a serial bus interface standard for personal computers and digital audio/video, originally championed by Apple in the United States. IEEE 1394 supports high-speed data transfer and isochronous real-time data services. Thanks to its low cost and its simple, flexible cabling system, IEEE 1394 replaced parallel SCSI.

 

Its main characteristics are as follows.

 

 - Standard digital interface: avoids the cumulative signal degradation caused by repeated digital-to-analog conversion

 - Fast, easy reconfiguration: unlike SCSI, devices can be installed and removed easily while the computer is running

 - Ease of use: simply attaching a cable connects a device to the computer via plug and play; high speed

 

IEEE 1394 comes in two variants, FireWire 400 and FireWire 800, supporting transfer rates of roughly 100/200/400 Mbps and 800 Mbps respectively. Both support hot-plugging. The connectors take two forms: 6-pin (2 pins for power, 4 pins for data) and 4-pin (4 pins for data only).

 

High performance serial bus

Fast, low cost

Easy to implement

 

FireWire Configuration

Daisy Chain

Up to 63 devices on a single port

 - Really 64, of which one is the interface itself

Up to 1022 buses can be connected with bridges

Automatic configuration

No bus terminators

May be tree structure
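The device and bus limits above fall out of FireWire's 16-bit node ID, split into a 10-bit bus ID and a 6-bit physical ID. A quick sanity check (the exact bookkeeping of reserved ID values is glossed over here and treated as an assumption):

```python
# FireWire node addressing: 16-bit node ID = 10-bit bus ID + 6-bit
# physical ID. Reserved-value details are simplified assumptions.
PHY_ID_BITS = 6
BUS_ID_BITS = 10

nodes_per_bus = 2 ** PHY_ID_BITS       # 64 addressable nodes per bus
devices_per_bus = nodes_per_bus - 1    # one node is the interface itself -> 63
bridged_buses = 2 ** BUS_ID_BITS - 2   # minus reserved bus IDs -> 1022

print(devices_per_bus, bridged_buses)  # 63 1022
```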

 





(Figure: example FireWire configuration)

 

FireWire 3 Layer Stack

-Physical

 Transmission medium, electrical and signaling characteristics

 

 Data rates from 25 to 4000Mbps

 Arbitration of bus access

  - Based on tree structure

  - Root acts as arbiter

  - First come first served

  - Natural priority controls simultaneous requests

  - Two supplementary forms: fair arbitration and urgent arbitration

 

-Link

 Transmission of data in packets

 Two transmission types

  - Asynchronous

   Variable amount of data and several bytes of transaction data transferred as a packet

   To explicit address

   Acknowledgement returned

   

  - Isochronous

   Variable amount of data in a sequence of fixed-size packets at regular intervals (suited to streaming services)

   Simplified addressing

   No acknowledgement

-Transaction 

 Request-response protocol
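The two link-layer transmission types above can be sketched in code. This is a minimal sketch only: the packet dictionaries and field names (such as "ack") are illustrative stand-ins, not the actual 1394 packet format.

```python
# Sketch of the two FireWire link-layer transfer types.
# Packet fields are illustrative, not the real wire format.

def isochronous_stream(total_bytes, packet_size):
    """Fixed-size packets at regular intervals; no acknowledgement."""
    sent = 0
    while sent < total_bytes:
        chunk = min(packet_size, total_bytes - sent)
        sent += chunk
        yield {"type": "iso", "size": chunk, "ack": None}

def asynchronous_write(address, payload):
    """Variable-size packet to an explicit address; acknowledgement returned."""
    return {"type": "async", "addr": address, "size": len(payload),
            "ack": "ack_complete"}

packets = list(isochronous_stream(1000, 480))
print(len(packets))                                 # 3 packets (480 + 480 + 40)
print(asynchronous_write(0xFFC0, b"hello")["ack"])  # ack_complete
```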

 

 

 

 

Reference:

 1) http://quarantinecrews.blogspot.kr/2013/12/inputoutputproblems-computers-have-wide.html 

 


[Computer Architecture] InfiniBand

2015.06.12 15:57 - Palpit
InfiniBand

 

InfiniBand is a switched-fabric communications link used in high-performance computing and enterprise data centers.

 

Its key characteristics are high throughput, low latency, and high reliability and scalability.

 

It is used to connect high-performance I/O equipment, such as computing nodes and storage devices.

 

 -> An architecture for dataflow between processors and intelligent I/O devices

 

Introduced to replace PCI

 

Aims to increase capacity, expandability, and flexibility

 -> Reliability, Availability, Serviceability for Internet infrastructures

 

Remote storage, networking and connection between servers

 - Attach servers, remote storage, network devices to central fabric of switches and links

 - Both for PCB and "out of box" interconnect

 

Greater server density(removing I/O from the server chassis)  

 - Blade server  

 

I/O distance from server of up to (depending on the transmission medium):

 - 17m using copper, 300m multimode fibre optic, 10km single mode fibre 

 

Up to 120 Gbps signalling rate (2.5 * 12 * 4)

 - Data rate: 2.5 Gb/s * 0.8 (8B/10B encoding) * 4 (QDR) * 12 lanes = 96 Gb/s (the encoding changes signal characteristics to reduce errors)

 - Implementers can aggregate links in units of 4 or 12, called 4X or 12X

 - A 12X QDR link therefore carries 120 Gbit/s raw, or 96 Gbit/s of useful data

 - As of 2009 most systems use a 4X aggregate, implying a 10 Gbit/s (SDR), 20 Gbit/s (DDR) or 40 Gbit/s (QDR) connection

 - Larger systems with 12X links are typically used for cluster and supercomputer interconnects and for inter-switch connections
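The link-rate arithmetic above can be checked directly, assuming SDR lanes at 2.5 Gb/s and 8b/10b encoding (80% efficiency):

```python
# InfiniBand link-rate arithmetic: base SDR lane = 2.5 Gb/s,
# 8b/10b encoding carries 8 data bits per 10 line bits.
LANE_SDR_GBPS = 2.5
ENCODING_EFF = 8 / 10

def raw_rate(lanes, speed_grade):
    """Signalling rate: lanes * base rate * speed grade (SDR=1, DDR=2, QDR=4)."""
    return lanes * LANE_SDR_GBPS * speed_grade

def data_rate(lanes, speed_grade):
    """Useful data rate after 8b/10b encoding overhead."""
    return raw_rate(lanes, speed_grade) * ENCODING_EFF

print(raw_rate(12, 4))    # 120.0 Gb/s raw for a 12X QDR link
print(data_rate(12, 4))   # 96.0 Gb/s of useful data
print(raw_rate(4, 4))     # 40.0 Gb/s for the common 4X QDR aggregate
```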

 











The end-to-end latency range:

 - MPI latency: 1.07 - 1.29 microseconds

 - Provides RDMA (remote-node read/write via DMA) capabilities for low CPU overhead

 

InfiniBand fabric: the switches and links form multiple redundant paths => fail-over and scalability



 

 

 

Reference:

 1) http://www.eetimes.com/document.asp?doc_id=1204375


 



[Computer Architecture] Calculate CPI

2015.06.12 15:56 - Palpit

[Computer Architecture] Moore's Law

2015.06.12 15:55 - Palpit

Moore's Law is a computing term which originated around 1970; the simplified version of this law states that processor speeds, or overall processing power for computers, will double every two years. A quick check among technicians in different computer companies shows that the term is not very popular, but the rule is still accepted.

[Source]: http://www.mooreslaw.org/


 

 

Three common statements of Moore's Law:

  1. The performance of semiconductor memory chips (memory capacity or CPU speed) doubles every 18 to 24 months: a law about the pace of technology development.
  2. Computing performance doubles every 18 months.
  3. The price of computing halves every 18 months.
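As a quick worked example of the doubling rule (the doubling period varies between the 18-month and 24-month formulations, so it is parameterized here):

```python
# Growth factor after a given number of years under a clean doubling law.
def growth_factor(years, doubling_period_years=2.0):
    return 2 ** (years / doubling_period_years)

print(growth_factor(10))                 # 32.0x over a decade (24-month doubling)
print(round(growth_factor(10, 1.5), 1))  # ~101.6x over a decade (18-month doubling)
```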


[Computer Architecture] Calculate CPI, MIPS, Execution time

The processor is driven by a clock.

 

The clock runs at a constant frequency f, or equivalently with a constant cycle time t, where t = 1/f.

 

Ic (instruction count): the number of machine instructions executed for a program until it halts, or over some defined time interval.

 

CPI (cycles per instruction): the average number of clock cycles per instruction for a program.

 

Processor time needed to execute a given program: T = Ic * CPI * t

 

A commonly used measure of processor performance is the rate at which instructions execute: MIPS = Ic / (T * 10^6) = f / (CPI * 10^6)
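These definitions transcribe directly into code (a minimal sketch; units assumed: Ic in instructions, f in Hz):

```python
def execution_time(ic, cpi, f_hz):
    """T = Ic * CPI * t, with cycle time t = 1/f."""
    return ic * cpi / f_hz

def mips(f_hz, cpi):
    """MIPS = f / (CPI * 10^6)."""
    return f_hz / (cpi * 1e6)

# The two MIPS forms agree: Ic / (T * 10^6) == f / (CPI * 10^6).
ic, cpi, f = 2e6, 2.0, 400e6
t_exec = execution_time(ic, cpi, f)
assert abs(ic / (t_exec * 1e6) - mips(f, cpi)) < 1e-9
print(t_exec, mips(f, cpi))   # 0.01 200.0
```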

 

Example 1) A 400 MHz processor executes 2 million instructions with the instruction mix below. Calculate the MIPS rate.

  f = 400 MHz = 400 * 10^6

  Ic = 2 * 10^6

 

Instruction Type                    CPI   Instruction Mix (%)
 Arithmetic and logic                 1    60
 Load/store with cache hit            2    18
 Branch                               4    12
 Memory reference with cache miss     8    10

 

average CPI = (1*0.6) + (2*0.18) + (4*0.12) + (8*0.1) = 2.24 

 

* MIPS rate = f / (CPI * 10^6) = (400 * 10^6) / (2.24 * 10^6) ≈ 178
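Recomputing Example 1 in code, with the mix percentages written as fractions:

```python
# Weighted-average CPI and MIPS rate for Example 1.
mix = [          # (CPI, fraction of instruction mix)
    (1, 0.60),   # arithmetic and logic
    (2, 0.18),   # load/store with cache hit
    (4, 0.12),   # branch
    (8, 0.10),   # memory reference with cache miss
]
avg_cpi = sum(cpi * frac for cpi, frac in mix)
f_hz = 400e6                        # 400 MHz
mips_rate = f_hz / (avg_cpi * 1e6)
print(round(avg_cpi, 2))            # 2.24
print(round(mips_rate, 1))          # 178.6 (truncated to 178 in the text)
```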

 

 

 







 

 

Example 2) Four benchmark programs are executed on three computers with the following results:

 

             Computer A   Computer B   Computer C
 Program 1        1           10          100
 Program 2      500          100           20
 Program 3      500         1000           50
 Program 4      100          800           50

 

The table shows the execution time in seconds, with 100,000,000 instructions executed in each of the four programs. Calculate the MIPS values for each computer for each program. 

 

Ic = 10 ^ 8

MIPS = Ic / (T * 10^6) = 10^8 / (T * 10^6) = 100 / T

 

MIPS rate

             Computer A        Computer B          Computer C
 Program 1   100 / 1 = 100     100 / 10 = 10       100 / 100 = 1
 Program 2   100 / 500 = 0.2   100 / 100 = 1       100 / 20 = 5
 Program 3   100 / 500 = 0.2   100 / 1000 = 0.1    100 / 50 = 2
 Program 4   100 / 100 = 1     100 / 800 = 0.125   100 / 50 = 2
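The same check for Example 2, dividing 100 by each execution time:

```python
# MIPS = 100 / T for each (program, computer) pair, with Ic = 10^8.
times = {   # execution time in seconds
    "Program 1": {"A": 1,   "B": 10,   "C": 100},
    "Program 2": {"A": 500, "B": 100,  "C": 20},
    "Program 3": {"A": 500, "B": 1000, "C": 50},
    "Program 4": {"A": 100, "B": 800,  "C": 50},
}
mips_table = {prog: {c: 100 / t for c, t in row.items()}
              for prog, row in times.items()}
print(mips_table["Program 1"])   # {'A': 100.0, 'B': 10.0, 'C': 1.0}
print(mips_table["Program 4"])   # {'A': 1.0, 'B': 0.125, 'C': 2.0}
```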

 

 

 

 

[Reference]: Computer Organization and Architecture: Designing for Performance, 8/E, William Stallings

