Shark SK435CO User Guide

Content summary

Page 1 - Shark User Guide

Shark User Guide (Legacy)

Page 2 - Contents

Figure 6-22 Data Mining Contextual Menu 168
Figure 6-23 After Focus Symbol -[SKTGraphicView drawRect:] 169
Figure 6-24 After focus and expansion 170
Figu

Page 3 - System Tracing 63

few disclosure triangles open, this view lets you logically follow your code paths until you reach a point where they call blocking library routines. A

Page 4

Note regarding launched target processes: When launching a process (as described in Process Launch (page 122)) with Time Profile (All Thread States),

Page 5

3. Start Delay: Specify a length of time that Shark should wait after being told to start collecting a profile before the collection begins. If the prog

Page 6

is often a good idea to look over the routines near the top of this list and make sure that both the routines allocating memory are the ones you think

Page 7 - Figures, Tables, and Listings

allocation and deallocation operations outside of loops, so that you can reuse the same memory buffers repeatedly without reallocating them each time t

Page 8

Advanced Display Options
Each Malloc Trace records a few additional pieces of information at each allocation event. These are not displayed by default,

Page 9

When you enable display of a particular type of data, it will appear in several places. First, columns displaying it will appear in the Profile Browser

Page 10 - Custom Configurations 183

Static Analysis
Most of Shark’s profiling methods limit their code analysis to those functions that appear dynamically in functions that are executed du

Page 11 - Miscellaneous Topics 242

● PowerPC Model: Selects the PowerPC model to use when searching for and assigning problem severities.
● Intel Model: Selects the Intel model to use w

Page 12

Sun’s Java virtual machine included with Mac OS X do provide an interface that Shark can use. As a result, Shark includes some special, Java-only confi

Page 13 - Introduction

Figure 8-8 PowerPC 970 IMC (IFU) Configuration Tab 217
Figure 8-9 PowerPC 970 IMC (IDU) Configuration Tab 221
Figure 8-10 U1.5/U2 Configuration Tab 223
F

Page 14 - Organization of This Document

● Java Alloc Trace: This records memory allocations and the sizes of the objects allocated, and is analogous to a regular Malloc Trace (Malloc Trace (p

Page 15

Event Counting and Profiling Overview
After analyzing an application using a Time Profile, you may find it informative to count system events or even sa

Page 16

Counter Spreadsheet Advanced Settings (page 116)). Selecting rows in this list also selects the corresponding columns in the counter table, graphing th

Page 17 - Getting Started with Shark

e. Shortcut Result Column(s): These columns show the performance counter results after they have been processed by the math in any “shortcut” equations.

Page 18 - Figure 1-2 Process Target

see Adding Shortcut Equations (page 119), below. For a complete description of how to write performance counter equations, including how to add them pe

Page 19 - Perform Sampling

The Counters Menu
When you switch to the Counters tab in a session made with timed performance counters, a Counters menu will appear in the menu bar. Yo

Page 20 - Session Windows and Files

Performance Counter Spreadsheet Advanced Settings
With the session window in the foreground, select Window > Show Advanced Settings (Command-Shift-M), a

Page 21 - Session Files

This drawer contains three main panels, each with many different controls that affect the presentation of results:
1. Counter Shortcut Equations: This ta

Page 22 - Session Information Sheet

● Bars: Display results using a vertical bar chart. Bars from multiple selected columns will be superimposed over one another.
● Stacks of Bars: Displa

Page 23 - Advanced Settings Drawer

Adding Shortcut Equations
This section gives a brief summary of how to add new “shortcut equation” results columns to your performance counter spreadshe

Page 24 - Shark Preferences

Retired Document | 2012-07-23 | Copyright © 2012 Apple Inc. All Rights Reserved.

Page 25 - Figure 1-6

Note: The built-in L2 cache miss profile configuration is a great way to find lines in your code that access memory in ways that cause very slow L2 ca

Page 26 - Shark Preferences — Sampling

While none of the default configurations use this capability, it is also possible to essentially record callstacks like a Time Profile simultaneously w

Page 27 - Shark Preferences — Sessions

Although the Start button makes starting and stopping Shark quite simple, sometimes it can be impractical, or even impossible to use. For example, how

Page 28 - Figure 1-9

Process Attach mode, by selecting Process from the Target popup (Command-2). Now, select the “Launch...” target from the top of the process list (or us

Page 29 - Time Profiling

2. Working Dir: The full path to the working directory that the application will start using. By default, this is the path where the executable is locat

Page 30 - Figure 2-2 Sampling Results

Batch Mode
Batch mode queues up any sessions recorded without displaying them. Pending sessions are listed in the main Shark window. Batch mode allows m

Page 31 - Taking a Time Profile

leaving the area of interest. However, you may not always know when you will encounter the “interesting” region of your program in advance. WTF mode si

Page 32 - Profile Browser

Tracing the execution around an asynchronous event, such as inter-thread communication, the arrival of a network packet, or OS event such as a page fau

Page 33

Second, the beginning of a WTF System Trace Timeline (see Figure 5-6) can appear a bit strange; different processors might first appear at vastly diffe

Page 34 - Tuning Advice

Sampling UnresponsiveApplications menuitem(Command-Shift-A ).WhenUnresponsiveApplicationTriggeringis enabled, Shark will automatically switch to Batc

Page 35 - Callstack Table

Important: This document may not represent best practices for current development. Links to downloads and other resources may no longer be valid.
Overv

Page 36 - Tree View

general, it is intended that you use it to collect sessions and then review your results with a graphical copy of Shark later. This section will discus

Page 37 - Figure 2-8 Tree Profile View

Remote Mode
A third way to use command line shark is remote mode, which works much like the remote mode supported by graphical Shark and described in In

Page 38 - Profile Display Preferences

● Time Interval: shark -I allows you to change the sampling interval for configurations that support a sampling interval. Valid times are entered the

Page 39

Reports
Command line shark supports generation of textual reports, either from session files that you’ve already created, or from new sessions as they a

Page 40

More Information
This section has presented some of the most common options and techniques for using command-line shark. For more detailed information o

Page 41

It is important to keep in mind that many profiling techniques used by Shark employ statistical sampling in order to generate a profile. If the samplin

Page 42

sprintf(label_str, "Hanoi #%d", i);
chudStartRemotePerfMonitor(label_str);
Hanoi('A','B','C',i);
chudStopRemotePe

Page 43 - Advanced Chart View Settings

The Towers of Hanoi test program demonstrates the need for a sampling interval that is much shorter than the time between the calls to start and stop S

Page 44

When used to stop profiling, chudRemoteCtrl will not return until Shark has stopped profiling. In the case of command-line shark, chudRemoteCtrl will n

Page 45 - Code Browser

Important: Shark cannot capture symbol information on the iPhone itself, so “raw” sessions recorded from an iPhone will appear in Shark labeled only b

Page 46

2. It must be relevant. Optimizing functionality that is rarely used is usually counter-productive.
3. It shows up as a hot spot in a time profile. If th

Page 47 - Figure 2-13

● Control network profiling of shared computers: Any computers on the network (in the local domain) running Shark in “shared” network mode will automa

Page 48 - Assembly Browser

● Config: The currently active Sampling Configuration on the shared computer. The entries in this column are menus, just like the one in Shark’s main w

Page 49

then respond to network requests to start and stop profiling. A sample transcript of a remote command line shark in “Network Sharing” mode is shown in

Page 50

Mac OS X Firewall Considerations
The sharing firewall on Mac OS X can prevent Shark’s network profiling from working in either sharing or control mode.

Page 51

Click the Sharing... button in the warning dialog to bring up the System Preferences window Sharing tab. Otherwise click the Ignore button to dismiss t

Page 52

Often, the profile analysis windows can provide you with a very helpful view of your application’s behavior using the default settings.

Page 53

If symbol lookup fails, Shark may present the missing “symbols” in two different ways. If the memory of the process is readable, for example, a binary

Page 54 - ISA Reference Window

require debugging information to work, but it can be much more helpful if it’s available. In case you record a Shark session and discover that symbols

Page 55

No matter which way you choose to get here, you will be presented with a Symbolication dialog (Figure 2-20).
Figure 6-2 Symbolication Dialog
Use this di

Page 56 - Tips and Tricks

Shark will warn you if you select a binary that is potentially problematic. If you do happen to select an executable that isn’t a good match, the profi

Page 57

● Getting Started with Shark: This introduction and Getting Started with Shark (page 17) are designed to give you an overall introduction to Shark. Aft

Page 58

Figure 6-4 After Symbolication
Managing Sessions
If you have multiple sessions measuring the same application, it is possible to use Shark to compare or

Page 59

When used, a new session is created from two existing ones: Session A and Session B. The first session (Session A) is given a negative scaling factor,
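
The effect of a negative scaling factor can be pictured with a small sketch. This is only an illustration under stated assumptions, not Shark's implementation: it assumes each session reduces to an array of per-symbol sample counts in the same symbol order, and that Session B receives a scaling factor of +1 (the text above only states that Session A's factor is negative). The function name combine_sessions is hypothetical.

```c
#include <stddef.h>

/* Hypothetical sketch: combine per-symbol sample counts from two sessions.
 * out[i] = scale_a * a[i] + scale_b * b[i]. With a negative scale_a,
 * Session A is subtracted, so a positive result marks a symbol that
 * accounts for more samples in Session B than in Session A. */
void combine_sessions(const double *a, double scale_a,
                      const double *b, double scale_b,
                      double *out, size_t nsymbols)
{
    for (size_t i = 0; i < nsymbols; i++)
        out[i] = scale_a * a[i] + scale_b * b[i];
}
```

Called with scale factors of -1.0 and 1.0, the output is the per-symbol difference B minus A, which is one plausible reading of how a comparison session highlights changes between two runs.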

Page 60 - Vectorization

Callstack Data Mining
In order to understand how to use data mining to better understand your application, it is necessary to first understand a few fun

Page 61 - 1.00xOriginal

large routines farther down the callstack that call many other routines in the course of their execution. Once you have a clear picture of how callstac

Page 62 - 5.69xAll Vector

Figure 6-7 Tree View (call tree showing Total and Self counts for main, foo, bar, cos, sqrt, and baz)

Page 63 - System Tracing

in controlled ways. For example, you often won’t care about the exact places that samples occur within Mac OS X’s extensive libraries, only which of yo

Page 64 - Basic Usage

a flag such as ‘-g’ with GCC or XLC, and in the process eliminating a lot of user-level code that you probably do not have control over. Samples from c

Page 65

9. Focus Callers of Symbol X: Removes functions called by the specified symbol and removes callstacks that do not contain the specified symbol.
10. Focus

Page 66 - Interpreting Sessions

The Perf Count Data Mining palette also supplies a global enable/disable toggle, much like the one available with conventional data mining, and check b

Page 67 - Summary View In-depth

2. Make four shapes as shown in Figure 6-11.
Figure 6-11 Example Shapes
3. Repeat the following steps until the app becomes sluggish (takes a half second or

Page 68 - Scheduler Summary

Counter Event List (page 252), PPC 750 (G3) Performance Counter Event List (page 263), PPC 7400 (G4) Performance Counter Event List (page 265), PPC 745

Page 69 - System Calls Summary

This should take 8-10 times (maybe more) depending on hardware. When you are done it should look something similar to Figure 6-12.
Figure 6-12 Example Sha

Page 70

This reveals a third pop-up button that you can use to target your application. Select Sketch from the list of running applications.
Figure 6-13 Sampling

Page 71

High Level Analysis
The session window gives you by default a summary of all the functions that the sampler found samples in and the percentage of the s

Page 72

3. Click on the callstack button on the lower right corner of the table to reveal the callstack pane, as shown in Figure 6-15. As you click on symbols o

Page 73 - Trace View In-depth

2. Double click on the symbol -[SKTGraphicView selectAll:] in the tree view above. You will see a source window that looks like Figure 6-17.
Figure 6-17 So

Page 74 - System Call Trace

3. Double-click on the yellow colored line to navigate to the function (performSelector) called here. When the new source window comes up, double-click

Page 75 - VM Fault Trace

4. Double-click on the yellow colored line [self performSelector: sel withObject:[array objectAtIndex:i]]; and you'll get Figure 6-19:
Figure 6-19 So

Page 76

5. Double-click on [self invalidateGraphic:graphic]; and you'll get Figure 6-20. This contains one line of expensive code that tests for nested obj

Page 77 - Timeline View In-depth

Introduction To Focusing
This example will take us through analyzing the behavior of drawing the selected rectangles. Here, we will develop ideas for an

Page 78 - Figure 3-11 Timeline View

5. Choose "Focus Symbol -[SKTGraphicView drawRect:]" and you will get something that looks like Figure 6-23.
Figure 6-23 After Focus Symbol -[SKT

Page 79 - Thread Run Intervals

Starting to use Shark is a relatively simple process. You only need to choose one or two items from menus and press a big “Start” button in order to st

Page 80 - System Calls

6. Expand -[SKTGraphicView drawRect:] in the bottom outline a few times until it looks like Figure 6-24:
Figure 6-24 After focus and expansion
There a

Page 81

7. Double click on -[SKTGraphic drawInView:isSelected] to see the source, as shown in Figure 6-25:
Figure 6-25 Source View: SKTGraphic drawInView:isSelec

Page 82 - VM Faults

8. Double click on line 406 on the text -[self drawHandlesInView: view] and you'll get Figure 6-26:
Figure 6-26 Source View: SKGraphic drawHandlesIn

Page 83

9. Double click on line 502 in the text [self drawHandleAtPoint: ...] and it will take you to the code for [SKTGraphicview drawHandleAtPoint: ...] which

Page 84 - Interrupts

2. We're going to work with the “Heavy View” (the upper profile) for a bit. So click the and set it back to .
3. Select the first symbol in the upper

Page 85 - Sign Posts

5. In the left hand outline select the symbol ripd_mark and control+click on it to bring up the data mining contextual menu. Choose "Charge Library

Page 86

This example is a bit simplistic, but it shows the power of the exclusion operations to strip out unnecessary information and identify where the real c

Page 87

3. Target your application and choose “Malloc Trace” instead of “Time Profile,” as with Figure 6-32.
Figure 6-32 Malloc Trace Main Window
4. Switch back to

Page 88

The window should look like Figure 6-33, if you have gone through Tutorial 1 first. Otherwise, it will look similar but not exactly the same.
Figure 6-3

Page 89

Graphical Analysis of a Malloc Trace
1. Click on the Chart Tab and you'll get a window that looks like Figure 6-34.
Figure 6-34 Chart View
The lower g

Page 90

● Malloc Trace: If your program allocates and deallocates a lot of memory, performance can suffer and the odds of accidental memory leaks increase. Sha

Page 91

2. Select the first hump just before sample 6,000 and enlarge it, as shown in Figure 6-35:
Figure 6-35 Place to Select
The yellow indicates the tenure of

Page 92 - Listing 3-2 signPostExample.c

3. Now use the slider on the bottom left of the window to adjust zoom. Play with this a bit. As you zoom in and out you'll see that there are multi

Page 93

4. We'll finish up with another good application of this graphical analysis. Click on the call stack button to reveal the call stack for this sampl

Page 94 - ~/Library/Application

Up until now, you have been using the configuration menu in Shark’s main window (in Figure 7-1) to select from various built-in sampling methods. Each

Page 95

The Config Editor
The Configuration Editor lets you individually modify settings for any of Shark’s modules, which are called PlugIns. The properties av

Page 96

● You can Rename any custom config in the list, but not built-in config files. A renamed config will be changed in the appropriate Configs folder immed

Page 97

● In Advanced mode, all of the available plugins are listed with a checkbox next to each indicating whether or not it is enabled in the current config.

Page 98

● Sampling Tab: The controls on this tab (see Figure 7-3) determine when to start and stop recording samples.
1. Windowed Time Facility: If enabled, Sha

Page 99

column to select the performance counter mode (None, Counter, or Trigger). Only a small subset of possible counter options are available here. For more

Page 100

Malloc Data Source PlugIn Editor
The Malloc data source is used for the Malloc Trace config described in Malloc Trace (page 101). It is used for collect

Page 101 - Malloc Trace

Mini Configuration Editors
Each configuration typically has a few parameters that are frequently modified. Shark allows you to edit these easily using t

Page 102 - Using a Malloc Trace

Static Analysis Data Source PlugIn Editor
The Static Analysis data source is used by the Static Analysis default configuration, described in Static Anal

Page 103 - Figure 4-5

4. Processor Settings: Shark needs to know which model of processor is your target before it can examine code and find potential problems. Separate menu

Page 104 - Figure 4-6

Sampler Data Source PlugIn Editor
The Sampler data source provides the same functionality as the separate Sampler application and command-line tool. It

Page 105 - Advanced Display Options

System Trace Data Source PlugIn Editor
This data source collects data for the System Trace default configuration, described in System Tracing (page 63).

Page 106

All Thread States Data Source PlugIn Editor
This data source collects data for the Time Profile (All Thread States) default configuration, described in

Page 107 - Static Analysis

Analysis and Viewer PlugIn Summary
All Data Source PlugIns include configuration editors. However, most of the analysis and viewer editors do not. While

Page 108

● System Trace: Timeline: This can only be used with the “System Trace” data source and analysis PlugIns. It displays the Timeline tab used by System T

Page 109 - *AVA#LASSES

This view contains the following constituent parts:
1. PMC Summary Table: This table summarizes all the performance counters (PMCs) that are currently s

Page 110

Description | Shortcut Equation Terms
Represents a summation of results from all processors on counter-Y. For example: pNc1 is the term that represents event

Page 111

Spreadsheet Configuration Example
Because this editor is very flexible and powerful, an example can be helpful to illustrate how it might be used. Start

Page 112

Contents
Introduction 13
Overview 13
Philosophy 13
Organization of This Document 14
Getting Started with Shark 17
Main Window 17
Mini Configuration Editors 1

Page 113

Note: Occasionally you may notice a small delay while Shark allocates the sample buffers it needs to record data, due to time spent in the Mac OS X vi

Page 114

2. Next search the list by typing “INST” into the search field, as is shown in Figure 7-13. Select the “INST_RETIRED” entry and change the mode to “Coun

Page 115 - The Counters Menu

Next, enter the equation pNc3/pNc2, as is shown in Figure 7-14. This will automatically calculate the number of cycles per completed instruction, or CP
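
As a rough illustration of what the pNc3/pNc2 equation computes, the sketch below sums two counters across all processors and divides the totals. It assumes, following the example configuration, that counter 3 holds CPU cycles and counter 2 holds completed instructions; the function names and the idea of passing per-processor arrays are hypothetical and are not part of Shark.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum one counter's values across all processors (the "pN" part of a term). */
uint64_t sum_counter(const uint64_t *per_cpu, size_t ncpus)
{
    uint64_t total = 0;
    for (size_t i = 0; i < ncpus; i++)
        total += per_cpu[i];
    return total;
}

/* Hypothetical helper: cycles per completed instruction.
 * Assumes cycles[] plays the role of counter 3 and instrs[] of counter 2. */
double cycles_per_instruction(const uint64_t *cycles,
                              const uint64_t *instrs, size_t ncpus)
{
    uint64_t total_instrs = sum_counter(instrs, ncpus);
    assert(total_instrs > 0); /* a ratio is undefined with no instructions */
    return (double)sum_counter(cycles, ncpus) / (double)total_instrs;
}
```

For instance, two processors reporting 300 and 500 cycles with 100 and 300 completed instructions give 800 / 400, or 2.0 cycles per instruction.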

Page 116

The different CPUs and North bridge chipsets available in Macintosh systems have widely varying performance monitoring capabilities. Because there are

Page 117

Once you have decided which counters you want to measure, and thought a bit about how you might want to control sampling, there are several configurati

Page 118

● Sample Limit: Sets the maximum number of samples to record. Specifying a maximum of N samples will result in at most N samples being taken on a unip

Page 119 - Adding Shortcut Equations

● chudRecordUserSample: A sample is recorded for every call to the CHUD.framework chudRecordUserSample() function. This is analogous to using signposts

Page 120 - Event SamplingTimer Sampling

Common Elements in Performance Counter Configuration Tabs
All of the various performance counter configuration tabs have many unique elements, as the v

Page 121

3. Sample Interval: This is the number of events that must occur before this PMC will trigger sampling. It is ignored unless this particular counter has
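
The idea of a sample interval measured in events rather than time can be sketched in software. This is only an analogy under stated assumptions: on real hardware the performance counter itself triggers the sample, and the event_trigger type and function below are hypothetical names, not CHUD or Shark API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an event-triggered sampler. */
typedef struct {
    uint64_t interval; /* events that must occur before a sample is taken */
    uint64_t pending;  /* events seen since the last sample */
} event_trigger;

/* Record one event; returns true when a sample should be taken,
 * which happens once every `interval` events. */
bool event_trigger_count(event_trigger *t)
{
    t->pending++;
    if (t->pending >= t->interval) {
        t->pending = 0;
        return true;
    }
    return false;
}
```

With an interval of 3, the first two events return false and the third returns true, after which counting starts again from zero.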

Page 122 - Advanced Profiling Control

You can mark processes with Shark’s Process Marker (Figure 8-3). The Process Marker can be opened via the Sampling > Mark Process menu item. Shark disab

Page 123 - Process Launch

● Scheduler Events: Events such as context switches, “thread ready” events, and stack handoffs
● Disk I/O Events: Disk reads and writes, with optional

Page 124

Shark allows you to work with multiple sampling sessions at a time, displaying a separate window for each session. This is useful for comparing two or

Page 125 - Windowed Time Facility (WTF)

both count a similar but not identical list of events on the programmable processors. Full event listings are provided in Intel Core Performance Counte

Page 126

bit-names in the mask list. Any bit in the list labeled *Reserved* should not be enabled. A brief summary of which bits are active for any particular e

Page 127 - WTF with System Trace

Figure 8-6 shows the single configuration tab for the G4+ processor (the one for the G3 and G4 is virtually identical, but lacks PMCs 5–6). For the mos

Page 128

Warning: If you leave branch folding disabled and exit Shark, branch folding will remain disabled. While this will not cause any correctness problems or

Page 129 - Command Line Shark

In addition, several additional controls are provided. Most are multiplexer controls to switch the various event pre-filtering multiplexers, but the la

Page 130 - Basic Methodology

8. TB Select: This is the divider used for timebase events that cause processor exceptions, and selects from four different division ratios. More inform

Page 131 - Common Options

to count it. Please note that as long as an instruction resides in the L1 instruction cache, its match bit will remain unchanged. Hence, if the match c

Page 132 - Target Selection

Due to the very flexible and complex nature of these mechanisms, it is highly recommended that you read the pertinent sections of the PowerPC 970 Docum

Page 133 - Custom Configurations

2. IOP Marking: This pre-filter will limit the type of internal PowerPC microinstructions (IOPs) that are matched or sampled.
● All IOPs: (default) Any

Page 134 - Interprocess Remote Control

5. Major Opcode Bits: This allows you to select marked instructions on the basis of their six major opcode bits (bits 0–5 of each PowerPC instruction).

Page 135 - Example: Towers of Hanoi

Note: Shark’s session files have slowly evolved and changed over time, as new features have been added that made it difficult to keep backwards-compat

Page 136

● BSFL column: This lists the BSFL (Branch instruction, instruction that will be Split, First instruction in a dispatch group, and Last instruction in

Page 137 - Samples Taken

● 1: Match this bit position with 1. Normally only desired if the corresponding IMRMASK bit is 1, or if you want to intentionally match nothing.
Figure

Page 138 - Network/iPhone Profiling

settings, we strongly suggest that you start using the “Simple” settings at first, as described in Simple Timed Samples and Counters Config Editor (pag

Page 139

4. FireWire/Enet: The dedicated FireWire and Ethernet I/O ports
Figure 8-10 U1.5/U2 Configuration Tab
U3 North Bridge
This section describes how you can m

Page 140

b. Write: Only store requests to memory can increment the counter.
c. Read: Only load requests from memory can increment the counter.
d. Any: All memory

Page 141 - Using Shared Profiling Mode

e. AGP: The AGP interface
Figure 8-11 U3 Memory Configuration Tab
Figure 8-12 shows the second of U3’s two configuration tabs, the API configuration

Page 142

2. Divider PopUp: This is the same as the Divider PopUp on the memory tab.
Figure 8-12 U3 API Configuration Tab
U4 (Kodiak) North Bridge
This section de

Page 143

b. Write: Only store requests to memory can increment the counter.
c. Read: Only load requests from memory can increment the counter.
d. Any: All memory

Page 144

Figure 8-14 shows the second of U4’s two configuration tabs, the API configuration panel. As with the memory tab, the first line of each PMC’s controls

Page 145 - Symbol Lookup

ARM11 CPU Performance Counter Configuration
This section describes how you can make custom configurations for iOS devices with ARM11 processors. These d

Page 146 - Manual Session Symbolication

1. Basic Statistics: This section of the pane contains basic information about the system at the time the session was recorded. The system’s name, the

Page 147

Menu Reference
This section summarizes Shark’s commands, arranged by menu.
Shark
This menu contains the usual application-menu commands.
Where Described | De

Page 148

Where Described | Description | Shortcut | Command
 | Close the frontmost window. If the frontmost window is the main control window, this will quit Shark. | Cmd-W | Close

Page 149

Description | Shortcut | Command
Redo the next action. | Shift-Cmd-Z | Redo
Cut the selected text, placing it on the clipboard. | Cmd-X | Cut
Copy the selected text to the

Page 150 - Managing Sessions

Format
All items in this menu are standard text processing commands. Since it is generally not possible to apply custom formats to most text within Shar

Page 151 - Data Mining

Where Described | Description | Shortcut | Command
Mini Configuration Editors (page 19) | Show/Hide the mini config editor attached to the main control window. | Shift-C

Page 152 - Callstack Data Mining

Where Described | Description | Shortcut | Command
Network/iPhone Profiling (page 138) | Enable Network Profiling of other computers or iPhones, instead of local prof

Page 153 - Figure 6-6 Heavy View

Window
Along with standard window control functionality, this contains the command to show or hide the Advanced Settings drawer on the right side of eac

Page 154 - Figure 6-7 Tree View

Menu | Where Described | Description | Shortcut | Command
Sampling | Batch Mode (page 125) | Toggles Batch mode, allowing the recording of multiple sessions before analysis be

Page 155

Menu | Where Described | Description | Shortcut | Command
Data Mining | Data Mining (page 151) | Just hide the selected library(ies), without adding time to the callers. | Shift-C

Page 156

Menu | Where Described | Description | Shortcut | Command
File | Session Files (page 21) | Attach a copy of the frontmost session to a new email in your default email program.

Page 157 - Perf Count Data Mining

Advanced Settings menu item (Command-Shift-M). An example is depicted below in Main Window. The controls presented will vary depending upon the current

Page 158 - A Performance Problem

Menu | Where Described | Description | Shortcut | Command
Data Mining | Data Mining (page 151) | Hide all callstacks which contain the selected symbol(s). | Cmd-K | Remove Callstack

Page 159 - Example Shapes

Menu | Where Described | Description | Shortcut | Command
Config | Mini Configuration Editors (page 19) | Show/Hide the mini config editor attached to the main control window

Page 160 - Taking Samples

Code Analysis with the G5 (PPC970) Model
Shark offers several features designed to help the programmer understand instruction execution behavior on the

Page 161 - Figure 6-14

note that the data in the G5 Resource Utilization drawer is based on the currently selected instructions in the Code Table, or on the entire code seque

Page 162 - High Level Analysis

any timer interrupts that occur in it are not serviced until interrupts are reenabled in ml_restore(). It is for this reason that all of the timer samp

Page 163 - Figure 6-16

A more accurate picture of the kernel behavior can be seen with event sampling (Figure B-3). This is because CPU event sampling reads the SIAR (sampled

Page 164 - Figure 6-17

Intel’s Core processors have 2 performance counters per core. Both are programmable, and can count 111 (#1) or 112 (#2) different types of events. Most

Page 165 - Source View: NSObject

Valid Event-Mask Bits | PMC Number | Event Number | Performance Counter Event Name
none | 1,2 | 147 | BR_CALL_MISSP_EXEC
none | 1,2 | 139 | BR_CND_EXEC
none | 1,2 | 140 | BR_CND_MISSP_EXEC
non

Page 166 - Figure 6-19

Valid Event-Mask Bits | PMC Number | Event Number | Performance Counter Event Name
6 | 1,2 | 101 | BUS_TRAN_BRD
5 6 7 | 1,2 | 110 | BUS_TRAN_BURST
5 6 7 | 1,2 | 109 | BUS_TRAN_DEF
6 | 1,2 | 104 | BUS_T

Page 167 - Figure 6-20

Valid Event-Mask Bits | PMC Number | Event Number | Performance Counter Event Name
none | 1,2 | 215 | EMON_ESP_UOPS
0 1 | 1,2 | 218 | EMON_FUSED_UOPS_RET
0 1 | 1,2 | 7 | EMON_KNI_PREF_DISPATC

Page 168 - Introduction To Focusing

3. Alternating/Solid Table Background: For tabular session window views, such as the profile browsers and code browsers described in Profile Browser (pa

Page 169 - Figure 6-23

Valid Event-Mask Bits | PMC Number | Event Number | Performance Counter Event Name
none | 1,2 | 192 | INST_RETIRED
none | 1,2 | 133 | ITLB_MISS
0 1 2 3 | 1,2 | 64 | L1_CACHEABLE_DATA_READS
0 1

Page 170 - After focus and expansion

Valid Event-Mask Bits | PMC Number | Event Number | Performance Counter Event Name
none | 1,2 | 176 | MMX_INSTR_EXEC
0 1 2 3 4 5 | 1,2 | 179 | MMX_INSTR_TYPE_EXEC
none | 1,2 | 177 | MMX_SAT_I

Page 171 - Figure 6-25

Intel’s Core 2 processors have 5 performance counters per core. Two of these are fully programmable, and can count 116 (#1) or 115 (#2) different types

Page 172 - Figure 6-26

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name02none1146BR_CALL_EXEC02none1147BR_CALL_MISSP_EXEC02none1139BR_CND_EXEC02none

Page 173 - Dig Deeper by Charging Costs

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name02none1145BR_RET_BAC_MISSP_EXEC02none1143BR_RET_EXEC02none1144BR_RET_MISSP_EX

Page 174 - Figure 6-29

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name5 6 7199BUS_LOCK_CLOCKS (Core and BusAgents masks apply)0 6 724 5 6 7196BUS_R

Page 175 - Figure 6-31

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name5 6 71106BUS_TRANS_PWR0 725 6 71103BUS_TRANS_WB026 71125BUSQ_EMPTY0 720 1 6 7

Page 176

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name020 1 3 51119EXT_SNOOP020217FP_ASSISTnone116FP_COMP_OPS_EXE0 11204FP_MMX_TRAN

Page 177 - Figure 6-32

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name020 1 2 3 4166L1D_CACHE_LOCK020 1 2 3165L1D_CACHE_ST02none171L1D_M_EVICT02non

Page 178

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name020 1 2 3 6 7140L2_IFETCH020 1 2 3 4 5 6 7141L2_LD0 6 724 5 6 7136L2_LINES_IN

Page 179 - Chart View

3. Remain in Background: Shark normally brings itself to the front when sampling completes. This means that it will be the main application while it an

Page 180 - Place to Select

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name02none176LOAD_HIT_PRE0 2 320 21195MACHINE_NUKES0 2 3 4231170MACRO_INSTS.CISC_

Page 181

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name020 1 2 31213SEG_REG_RENAMES0 2 3 4 520 1 2 31212SEG_RENAME_STALLS02none16SEG

Page 182 - Figure 6-36

Valid Event-Mask BitsPMC NumberEvent NumberPerformance Counter Event Name0 720 117SSE_PRE_EXEC0 2 3 520 1175SSE_PRE_MISS020 1 314STORES BLOCKED026 715

Page 183

The PowerPC 750 (G3) cores contain four independent performance counters, each of which can count 12–17 different types of events. Four commonly measur

Page 184 - The Config Editor

Event Number | PMC Number(s) | Performance Counter Event Name
9 | 1, 2 | Instr Bkpt Matches
2 | 1, 2, 3, 4 | Instr Completed
4 | 1, 2, 3, 4 | Instr Dispatched
8 | 1, 2 | Instr Fetches
1

Page 185

The PowerPC 7400 (G4) cores contain four independent performance counters, each of which can count 27–48 different types of events. Four commonly measu

Page 186 - Figure 7-2 Config Editor

Event Number | PMC Number(s) | Performance Counter Event Name
15 | 3 | Branch Unit LR/CTR Stall Cycles
37 | 1 | Branch Unit Speculative Load Stall Cycles
13 | 1 | Branch Unit Sp

Page 187 - Figure 7-3

Event Number | PMC Number(s) | Performance Counter Event Name
18 | 3 | dL1 Cycles
24 | 1 | dL1 Hits
22 | 1 | dL1 Load Hits
15 | 2 | dL1 Load Misses
11 | 1 | dL1 Miss Cycles > Threshold
17 | 2 | d

Page 188 - Figure 7-4

Event Number | PMC Number(s) | Performance Counter Event Name
5 | 1 | EIEIO Instr
42 | 1 | External Snoop Requests
5 | 2 | Fall through Branches
11 | 3 | Floating Point Instr
21 | 3 | Full Ca

Page 189

Event Number | PMC Number(s) | Performance Counter Event Name
36 | 1 | L2 Allocations
44 | 1 | L2 Castout Snoop Hits
29 | 2 | L2 Sectors Castout
27 | 3 | L2 Snoop Hits
13 | 3 | L2 Snoop Inter

Page 190

1. Ask About Unsaved Sessions— With Shark, you can optionally disable the usual behavior of asking if you want to individually save each session file wh

Page 191

Event Number | PMC Number(s) | Performance Counter Event Name
11 | 4 | SYNC Instr
14 | 2 | System Register Unit Instr
3 | 1, 2, 3, 4 | TimeBase (Lower) 0->1 bit transitions
40

Page 192

The PowerPC 7450 (G4+) cores contain six independent performance counters, each of which can count 20–94 different types of events. CPU cycles can be m

Page 193

Event Number | PMC Number(s) | Performance Counter Event Name
18 | 1, 2 | AltiVec MFVSCR Instr Sync Cycles
13 | 1, 2, 4 | AltiVec MTVRSAVE Instr
12 | 1, 2, 4 | AltiVec MTVSCR In

Page 194

Event Number | PMC Number(s) | Performance Counter Event Name
47 | 6 | Bus Retry from L1 Retry
48 | 6 | Bus Retry from Prev-Adjacent
42 | 6 | Bus TA's for Reads
43 | 6 | Bus TA&ap

Page 195

Event Number | PMC Number(s) | Performance Counter Event Name
53 | 1 | dL1 Load Hits
21 | 3 | dL1 Load Miss Cycles
37 | 2 | dL1 Load Misses
43 | 1 | dL1 Load-Miss Cycles > Threshold

Page 196 - Using the Editor

Event Number | PMC Number(s) | Performance Counter Event Name
18 | 3 | DTLB Misses
23 | 4 | DTLB Search Cycles
40 | 1 | DTLB Search Cycles > Threshold
25 | 6 | DTQ Full
35 | 1 | EIEIO Inst

Page 197

Event Number | PMC Number(s) | Performance Counter Event Name
91 | 1 | FPSCR Renames 1/2 Busy
90 | 1 | FPSCR Renames 1/4 Busy
92 | 1 | FPSCR Renames 3/4 Busy
93 | 1 | FPSCR Renames All

Page 198

Event Number | PMC Number(s) | Performance Counter Event Name
19 | 6 | L1 External Interventions
10 | 6 | L2 Castout Queue Full Cycles
8 | 6 | L2 Castouts
20 | 6 | L2 External Interven

Page 199

Event NumberPMC Number(s)Performance Counter Event Name145, 6L3 Touch Hits376176L3 Write Queue Full Cycles731LD/ST Alias vs. CSQ721LD/ST Alias vs. FSQ

Page 200 - Figure 7-13

Event Number | PMC Number(s) | Performance Counter Event Name
29 | 2 | LWARX Instr
30 | 2 | MFSPR Instr
28 | 4 | Mispredicted Branches
36 | 1 | MTSPR Instr
0 | 1, 2, 3, 4, 5, 6 | Nothing
51, 2

Page 201

1. Source— Shark will usually find source files automatically if they are not moved between compilation and session viewing times. If you must move the

Page 202

Event Number | PMC Number(s) | Performance Counter Event Name
22 | 4 | Store String/Multi Pieces
27 | 2 | STSWI/STSWX/STMW Instr
15 | 3 | STWCX Instr
25 | 4 | Successful STWCX Instr
214

Page 203

Event Number | PMC Number(s) | Performance Counter Event Name
24 | 3 | VTE2 Line Fetches
26 | 4 | VTE3 Line Fetches
51 | 1 | Write-Through Stores
PPC 7450 (G4+) Performance Count

Page 204

The PowerPC 970 (G5) cores contain an extremely sophisticated and complex set of performance counters. Unlike the other processors used in Macintoshes,

Page 205

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName1: TTM00: FPU30[FPU] fp0 estimate + fp1estimate1: TTM00: FPU3, 4, 7

Page 206 - Counter Control

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: TTM00: FPU1, 2, 5, 623[FPU] fp1 add, mult, sub,compare, fsel2: T

Page 207 - Process Marking

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: TTM11: GPS1, 2, 5, 630[GPS] Cacheable store queue full1: TTM11:

Page 208 - Figure 8-3 Process Marker

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: TTM11: GPS50[GPS] L2 miss on store access (R,S, I) + I=1 store o

Page 209

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName1: TTM11: GPS3, 4, 7, 821[GPS] Master L2 read transactionon bus was

Page 210

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: TTM11: GPS3, 4, 7, 825[GPS] Snoop state machinedispatched3: TTM1

Page 211

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: TTM00: IFU432[IFU] cycles i L1 write active +nothing2: TTM00: IF

Page 212

The first and most frequently used Shark configuration is the Time Profile. This produces a statistical sampling of the program’s or system’s execution

Page 213

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: TTM00: ISU10[ISU] completion table full + crmapper full0: TTM11:

Page 214

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: TTM00: ISU432[ISU] duration MSR(EE) = 0 +MSR(EE)=0 and interrupt

Page 215

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: TTM00: ISU332[ISU] fx0 produced a result + fx1produced a result3

Page 216

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: TTM11: ISU2: TTM00: ISU632[ISU] instructions dispatchedcount + g

Page 217

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: TTM11: ISU0: LSU01, 2, 5, 618[LSU0] d erat miss side 00: LSU060[

Page 218

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName1: LSU030[LSU0] marked flush from LRQshl, lhl side 0 + marked flush

Page 219

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: LSU0532[LSU0] marked L1 d cache storemiss + larx executed 02: LS

Page 220

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: LSU1 6|70: LSU13: LSU1 2|360[LSU1] flush from LRQ shl,lhl side0

Page 221

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: LSU13: LSU1 2|31, 2, 5, 616[LSU1] flush unaligned load side03: L

Page 222 - U1.5/U2 North Bridges

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: LSU13: LSU1 2|31, 2, 5, 621[LSU1] flush unaligned store side13:

Page 223 - U3 North Bridge

System Tracing 63
Tracing Methodology 63
Basic Usage 64
Interpreting Sessions 66
Summary View In-depth 67
Trace View In-depth 73
Timeline View In-depth 77
Si

Page 224

as taking an entire time quantum balances out the numerous times that it is missed entirely, providing a fairly accurate measurement of the time spent

Page 225

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName1: LSU13: LSU1 2|340[LSU1] L1 d cache store miss +L1 dcache entries

Page 226 - U4 (Kodiak) North Bridge

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: LSU1 2|73: LSU1 6|33: LSU1 6|71: LSU13: LSU1 2|33, 4, 7, 816[LSU

Page 227

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName1: LSU13: LSU1 2|33, 4, 7, 821[LSU1] L1 dcache store side 13: LSU1

Page 228

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: LSU13: LSU1 2|7832[LSU1] L1 reload data source +Marked L1 reload

Page 229

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName3: LSU13: LSU1 2|33, 4, 7, 830[LSU1] LMQ slot 0 allocated3: LSU1 6|

Page 230 - Command Reference

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: LSU13: LSU1 6|31, 2, 5, 630[LSU1] LS1 reject - reload cdf ortag

Page 231 - DescriptionShortcutCommand

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: LSU13: LSU1 2|31, 2, 5, 624[LSU1] SRQ store forwarding side03: L

Page 232

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: TTM00: VMX1, 2, 629[VMX] forwarding occurred fromperm or alu or

Page 233

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName45CPU Marked Instruction finish51Dispatch Successes3: LSU117dL2 Hit

Page 234 - Sampling

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: TTM00: FPU16Instr Src Encode 0 (Lane 2 notset to IFU)0: ISU0: VM

Page 235

sampling mechanism are spread out to affect most areas of measured execution more or less equally. In contrast, most event counting-based mechanisms, s

Page 236 - Alphabetical Reference

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: ISU0: IFU0: VMX2: LSU10: FPU0: ISU0: IFU0: VMX2: TTM00: FPU36Ins

Page 237

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: TTM00: FPU46Instr Src Encode 3 (Lane 2 notset to IFU)0: ISU0: VM

Page 238

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: ISU0: IFU0: VMX2: LSU10: FPU0: ISU0: IFU0: VMX2: TTM00: FPU66Ins

Page 239

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName2: TTM00: FPU76Instr Src Encode 6 (Lane 2 notset to IFU)0: ISU0: VM

Page 240

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName0: ISU0: IFU0: VMX2: LSU10: FPU0: ISU0: IFU0: VMX11Instructions Com

Page 241

Byte LaneNumberTTM MuxNumberPMCNumber(s)EventNumber(s)Performance Counter EventName410Overflow from PMC3510Overflow from PMC4610Overflow from PMC5710O

Page 242 - Miscellaneous Topics

The U1.5 and U2 North bridge chipsets contain four independent counters, each of which can count any one of 55 different types of events. The table list

Page 243

Event Number | Performance Counter Event Name
72 | Burst Read Reqs [Bus]
73 | Burst Write Reqs [Bus]
65 | Burst Xacts [Bus]
91 | Cache Inhib. Xacts [Bus]
94 | Cycles Addr Bu

Page 244 - 0x96da8 (see Figure

Event Number | Performance Counter Event Name
98 | Read Prefetch Ops [Mem]
69 | Read Xacts [Bus]
97 | Retries on Maxbus [Bus]
86 | Single Beat Mem Reads [Bus]
75 | Single Be

Page 245

The U3 North bridge chipsets contain two distinct sets of counters. The first set of counters counts memory events, in a manner similar to the counters

Page 246 - Valid Event-Mask

5. Sample Limit — The maximum number of samples to record. Specifying a maximum of N samples will result in at most N samples being taken, even on a mul

Page 247

Event Number | API Performance Counter Event Name
0x00 | API Cycles
0xFF | Nothing
0x03 | Queue Reservations
0x01 | Queue Transactions
0x05 | Retries
0x04 | Transaction Size (by

Page 248

Source Number | API Event Source Name
0x200 | Master Tag: API0
0x400 | Master Tag: API0 and API1
0x300 | Master Tag: API1
0xA00 | Master Tag: HT
0x900 | Master Tag: PCI
0x800

Page 249

Source Number | API Event Source Name
0x00 | Synchronization Queue
0x15 | Vsp Coh Rd Rq Queue
0xA0 | Vsp Rd Data Queue
0x0F | Vsp Response Queue
0x0A | Vsp Target Rq Queue
0x

Page 250

The U4/Kodiak North bridge chipsets contain two distinct sets of counters. The first set of counters counts memory events, in a manner similar to the c

Page 251

Event Number | Memory Performance Counter Event Name
83 | Issued transfer size (accumulate events, no filters)
97 | Non-coherent read request [RT #24253] (count

Page 252

Source Number | API Event Source Name
0x28 | API Wt Data Buffer
0x10 | Bypass Queue
0x01 | Command Slot
0x27 | GCR Rd Data Queue
0x0B | GCR Response Queue
0x08 | GCR Target Rq Q

Page 253

Source Number | API Event Source Name
0x0D | PCIE Coh Wt Rq Queue
0x25 | PCIE Rd Data Queue
0x05 | PCIE Rd Target Rq Queue
0x09 | PCIE Response Queue
0x21 | PCIE Wt Data Que

Page 254

The ARM11 cores used in iOS devices contain three independent performance counters. The first counter can count only cycle counts, while the other two

Page 255

Event Number | PMC Number(s) | Performance Counter Event Name
15 | 2-3 | Main TLB miss
35 | 2-3 | Procedure call instruction executed
38 | 2-3 | Procedure return instruction exe

Page 256

This table describes the changes to Shark User Guide.
Notes Date TBD 2008-04-14 New document that explains how to analyze code performance by profiling the

Page 257

menu (#8), if you would rather see the “Tree” view, which is described in Tree View (page 36) and organizes the sample groups according to the program’

Page 258

Apple Inc. Copyright © 2012 Apple Inc. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, i

Page 259

TheEditFindFind command(Command-F )andtherelatedEditFindFindNext (Command-G )andEditFindFindPrevious (Command-Shift-G) commands are very useful

Page 260

e. Symbol— The symbol where this sample was located. Most of the time, this is the name of the function or subroutine that was executing when the sample

Page 261

6. Process Popup Menu— This lists all of the sampled processes, in order of descending number of samples in the profile, plus an “All” option at the top

Page 262

The “Tree” view gives you an overall picture of the program calling structure. In the sample profile (Figure 2-8), the top-level function is [CelestiaO

Page 263

Note on Heavy/Tree comparisons: Please note that there may not be a one-to-one correspondence between entries in “Tree” view and “Heavy” view. If you

Page 264

deep callstacks being over-represented in the profile, since they are counted many times, but makes it easier to find symbols for frequently-occurring

Page 265

Network/iPhone Profiling 138
Using Shared Profiling Mode 141
Mac OS X Firewall Considerations 143
Advanced Session Management and Data Mining 145
Automati

Page 266

Chart View
Click Shark’s Chart tab to explore sample data chronologically, from either a thread- or CPU-based perspective. This can help you understand

Page 267

1. Callstack Chart— This chart displays the depth (y-axis) of the callstack for each sample, chronologically from left-to-right over time (x-axis). The

Page 268

6. Callstack Table— This displays the functions within the callstack for the currently selected sample, with the leaf function at the top and the base o

Page 269

13. View Popup Menu— This popup lets you choose to view sets of samples from different processor cores.
Advanced Chart View Settings
The first pane of th

Page 270

7. Color Selection— Choose colors to use for user sample callstacks, kernel sample callstacks, and the selection area by clicking on these color wells.
F

Page 271

Code Browser
Double-clicking on an entry in the Results Table or Callstack Table will open a Code Browser view for that entry, as shown in Figure 2-12.

Page 272

2. Browse Buttons— You can use these buttons to maneuver through function calls. After you double-click on a function call (denoted by blue text) and go

Page 273

b. Total — This optional column lists the percentage of displayed references for each instruction or source line, including called functions. To see sam

Page 274

9. Source File Popup Menu— A given memory range can contain source code from more than one file because of inlining done by the compiler. You can select

Page 275

a. Address Column — This displays the address of the assembly-language instruction displayed on this row. With PowerPC, this value simply increases by 4

Page 276

Hardware Counter Configuration 202
Configuring the Sampling Technique: The Sampling Tab 202
Common Elements in Performance Counter Configuration Tabs 20

Page 277

4. Asm Help Button— Press this button to get help for the selected assembly-language instruction, as described in ISA Reference Window (page 54).
Figure

Page 278

6. Show Self Column— Toggles display of the column that lists the percentage of displayed references for each instruction or source line, but not includ
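The difference between the Self and Total columns can be illustrated with a small tally over callstack samples. This is only a sketch at function granularity, not Shark's implementation: the `Sample` struct, the `tally` function, the `MAX_DEPTH` and `NUM_FUNCS` limits, and the small integer function IDs are all hypothetical, and each stack is assumed to list frames from base to leaf.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 8   /* hypothetical cap on frames per sample */
#define NUM_FUNCS 4   /* hypothetical number of distinct function IDs */

/* One sample: function IDs ordered from base (index 0) to leaf (last). */
typedef struct {
    int frames[MAX_DEPTH];
    size_t depth;
} Sample;

/*
 * self_counts[f]  = samples in which f was the leaf (executing) frame.
 * total_counts[f] = samples in which f appears anywhere on the stack,
 *                   counted once per sample even if f recurses.
 */
static void tally(const Sample *samples, size_t n,
                  unsigned self_counts[NUM_FUNCS],
                  unsigned total_counts[NUM_FUNCS])
{
    memset(self_counts, 0, NUM_FUNCS * sizeof(unsigned));
    memset(total_counts, 0, NUM_FUNCS * sizeof(unsigned));

    for (size_t i = 0; i < n; i++) {
        const Sample *s = &samples[i];
        int seen[NUM_FUNCS] = {0};

        if (s->depth == 0)
            continue;               /* skip empty stacks */

        self_counts[s->frames[s->depth - 1]]++;

        for (size_t d = 0; d < s->depth; d++) {
            int f = s->frames[d];
            if (!seen[f]) {         /* avoid double-counting recursion */
                seen[f] = 1;
                total_counts[f]++;
            }
        }
    }
}
```

Dividing each count by the number of samples gives the percentages. A function that only calls others, such as a program's entry point, ends up with a self count of zero but a total count equal to the number of samples.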

Page 279

5. Show G5 (PPC970) Details Drawer— (PowerPC-only) Shark can display graphs of instruction dispatch slot and functional unit utilization in an additiona

Page 280

Figure 2-15 Advanced Settings for the Code Browser
Other architectures have slightly different options for items 3–5 of the Asm Browser

Page 281 - 511Write-Through Stores

●Syntax— Chooses whether to display the x86 instructions in Intel assembler syntax or AT&T syntax (the default).
●Show Prefixes— If checked, instr

Page 282

The ISA Reference Window provides an indexed, searchable interface to the PowerPC, IA-32 (32-bit x86), or EM64T (64-bit x86) instruction sets. The refe

Page 283

Tips and Tricks
This section points out a few things that you might see while looking at a Time Profile, what they may mean, and how to optimize your c

Page 284

●Chart View
●Different parts of the chart look visibly different: Different-looking areas were probably created by different code in your program as i

Page 285

Shark. Please note that in Xcode you will need to adjust the build settings for the Target that you are testing and the correct (optimized) build confi

Page 286

After compiling and running the reference decoder, Shark generated the session displayed in Figure 2-19. Just by pressing the “Start” and “Stop” button

Page 287

PPC 7450 (G4+) Performance Counter Event List 271
PPC 970 (G5) Performance Counter Event List 282
UniNorth-2 (U1.5/2) Performance Counter Event List 316

Page 288

Vectorization
Optimizing the Reference_IDCT() function by converting it from floating point to integer also presented another possible optimization that

Page 289

Add_Block()), colorspace conversion (dither()), and pixel interpolation (conv420to422() and conv422to444()) achieved a speedup of 5.69x over the origin

Page 290

Speedup | Optimization Step
1.12x | Fast floor()
1.86x | Integer IDCT
2.05x | Vector IDCT
5.69x | All Vector
Time Profiling Example: Optimizing MPEG-2 using Time Profiles R
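The speedup table lists a "Fast floor()" step, but this excerpt does not show the replacement that was used. As an illustration only, one common way to avoid a library floor() call is a truncating cast with a correction for negative inputs. The name `fast_floor` is hypothetical, and the sketch assumes the input fits in an `int`.

```c
/*
 * Hypothetical replacement for floor() when the value is known to fit
 * in an int. Converting a double to int truncates toward zero, so
 * negative non-integers must be stepped down by one.
 */
static inline int fast_floor(double x)
{
    int i = (int)x;                      /* truncates toward zero */
    return (x < (double)i) ? i - 1 : i;  /* fix up negative fractions */
}
```

Whether such a replacement is actually faster depends on the compiler and target, so it is worth confirming with a new Time Profile before and after the change.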

Page 291

Shark’s System Trace configuration records an exact trace of system-level events, such as system calls, thread scheduling decisions, interrupts, and vi

Page 292

and multithreading problems, because these issues frequently hinge upon managing the precise timing of interaction events properly in order to minimize

Page 293

●Start Time
●Stop Time
●A backtrace of the user-space function calls (callstack) associated with each event
●Additional data customized depending on

Page 294

Out of memory errors?: If you see these when starting a system trace, then just reduce the Sample Limit value until Shark is able to successfully allo

Page 295

Summary View In-depth
The Summary View is the starting point for most types of analysis, and is shown in Figure 3-3. Its most salient feature is a pie c

Page 296

Scheduler Summary
The Scheduler Summary tab, shown in Figure 3-4, summarizes the overall scheduling behavior of the threads running in the system during

Page 297

Note on Thread IDs: Thread IDs on Mac OS X are not necessarily unique across the duration of a System Trace Session. The Thread IDs reported by the ke

Page 298

Figures, Tables, and Listings
Getting Started with Shark 17
Figure 1-1 Main Window 17
Figure 1-2 Process Target 18
Figure 1-3 Mini Configuration Editor 19

Page 299

Note on System Trace callstacks: In rare cases, it is not possible for System Trace to accurately determine the user callstack for the currently activ

Page 300

More settings for modifying this display are available in the Advanced Settings drawer, and are described in Summary View Advanced Settings (page 71).
F

Page 301

4. Callstack Data Mining— The System Call and VM Fault summaries support Shark’s data mining options, described in Data Mining (page 151), which can als

Page 302

Trace View In-depth
The Trace View lists all of the events that occurred in the currently selected scope. Because events are most commonly viewed with “

Page 303

●Reason— Reason that the thread tenure ended (described in Thread Run Intervals (page 79))
●Priority— Dynamic scheduling priority of the thread
Figure

Page 304

occurred. Otherwise, the beginning and ending thread interval indices are listed. Because it is possible for an event to start before the beginning of

Page 305

You can toggle the display of the Callstack Table, which displays the user callstack for the currently selected VM fault entry, by clicking the button

Page 306

●Size— Number of bytes affected by the fault, an integral multiple of the 4096-byte system page size
Figure 3-10 Trace View: VM Faults
Timeline View In
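Because the Size field of a VM fault is described as an integral multiple of the 4096-byte page size, it converts directly to a page count. The helper below is a hypothetical illustration (the names `VM_PAGE_BYTES` and `fault_pages` are not part of Shark or Mac OS X) that also flags sizes that are not page multiples.

```c
#include <stdint.h>

#define VM_PAGE_BYTES 4096u   /* page size stated in the text above */

/*
 * Convert a fault's Size value (in bytes) to a number of pages.
 * Returns 0 when the size is not a whole number of pages, which per
 * the description above should not happen for a real VM fault.
 */
static uint64_t fault_pages(uint64_t size_bytes)
{
    if (size_bytes % VM_PAGE_BYTES != 0)
        return 0;
    return size_bytes / VM_PAGE_BYTES;
}
```

For example, a fault reported with a Size of 16384 bytes touches four pages.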

Page 307

●Keyboard Navigation— After highlighting a Thread Run Interval by clicking on it, the Left or Right Arrow keys will take you to the previous or next r

Page 308

Thread Run Intervals
Each time interval that a thread is actively running on a CPU is a thread run interval. Thread run intervals are depicted as solid

Page 309

System Tracing 63
Figure 3-1 Time Profile vs. System Trace Comparison 64
Figure 3-2 System Trace Mini Config Editor 65
Figure 3-3 Summary View 67
Figure 3

Page 310

There are five basic reasons a thread will be switched out by the system to run another thread:
Blocked— The thread is waiting on a resource and has vo

Page 311

MIG Message— Mach interface generator routines, which are usually only used within the kernel
Figure 3-14 Timeline View: System Calls
Calls from all of

Page 312

●Arguments— The first four integer arguments
Figure 3-15 System Call Inspector
VM Faults
As is the case with almost all modern operating systems, Mac OS

Page 313

Non-Zero Fill— A previously unused page not marked “zero fill on demand” was touched for the first time. Generally, this is only used in situations whe

Page 314

Three of these types of faults are visible in Figure 3-16. A zero-fill fault is circled to highlight it. Clicking on a VM Fault Icon will bring up the

Page 315

Clicking on an Interrupt icon will bring up the Interrupt Inspector. This inspector lists the amount of time the interrupt consumed, broken down by CPU

Page 316

the amount of time spent on the CPU and time spent Waiting between the begin and end event. Since you can supply different arguments to the start and e

Page 317

4. Draw Context Switch Lines— Check this to enable (default) or disable the thin gray lines that show context switches, linking the thread tenures befor

Page 318

6. Label Events— These checkboxes allow you to enable or disable the display of event icons either entirely, by type group, or on an individual, type-by

Page 319

Figure 3-20 Timeline View Advanced Settings Drawer
System Tracing: Interpreting Sessions
Retired Document | 2012-07-23 | Copyright © 2012 Appl

Page 320

Figure 4-14 Chart View with additional timed counter graphs 121
Advanced Profiling Control 122
Figure 5-1 Process Attach 122
Figure 5-2 Launch Process Pa

Page 321

Sign Posts
Even with all of the system-level instrumentation already included in Mac OS X, you may sometimes find that it is helpful or even necessary t

Page 322

●User Applications using CHUD Framework: User Applications that link with the CHUD.framework, and can simply call chudRecordSignPost(), which has the

Page 323

Listing 3-2 signPostExample.c
#include <CHUD/CHUD.h>
#include <stdint.h>
/* This corresponds to the sign post defined above, LoopTimer */
#def

Page 324

/*
 * Use the kernel_debug() method when in the kernel (arg5 is unused),
 * DBG_FUNC_START corresponds to chudBeginIntervalSignPost.
 */
kernel_debug(APPS_DE

Page 325

It can also indicate that your threads frequently block while waiting for locks. In this case, it is possible that the short intervals are inherent to

Page 326

●Multi-threaded application has only one thread running at a time: First of all, ensure you’ve performed the System Trace on a multiprocessor mac

Page 327

Another possibility is that you’ve simply not given your worker threads enough work to do. Verify this theory using the tip from the summary view sugge

Page 328

Not every performance problem stems from computation in a program or a program’s interaction with the operating system. For these other types of proble

Page 329 - Document Revision History

5. Prefer User Callstacks— When enabled, Shark will ignore and discard any samples from threads running exclusively in the kernel. This can eliminate sp

Page 330

showing you how much time your threads are blocked and how often they are running. As a result, it is a good “sanity check” technique to make sure that
