frequently asked questions on lrdev
contents:
C/C++ / Java / assembly / patterns / Unix / math / wine / miscellaneous / ask
C/C++ programming languages
-
What is ISO/IEC 9899:2011?
ISO/IEC 9899:1999?
ISO/IEC 9899:1990?
Where can I find ISO/IEC 9899?
-
The ISO/IEC 9899:2011 standard defines
the C programming language.
Older versions of this standard are known as ISO/IEC 9899:1999, ISO/IEC 9899:1990
or ANSI/ISO 9899-1990 (ANSI C); the latter is also described in K&R2
[The C Programming Language,
Brian W. Kernighan, Dennis M. Ritchie,
Prentice Hall,
2nd Ed. 1988,
ISBN 0-13-110362-8].
ISO standards are not free.
They can be bought
[online, as PDF, too]
at
ISO.
[
info on C standards
/
ISO/IEC 9899:2011 order sheet
/
C Wikipedia page
/
an ancient entry point for C programming language information
/
C programming language coding guidelines with background information
]
-
What is ISO/IEC 14882:2011?
ISO/IEC 14882:2003?
Where can I find it?
-
This ISO/IEC standard defines
the C++ programming language.
The older ISO/IEC 14882:1998 / ISO/IEC 14882:2003
is also described in
[The C++ Programming Language,
Bjarne Stroustrup,
Addison-Wesley,
3rd Ed. 1997,
ISBN 0-201-88954-4].
See
question and comments above.
-
What's the C programming language definition?
C programming language reference?
What's the C++ programming language definition?
C++ programming language reference?
-
See the
two questions above
[for books and ISO standards].
-
Do you feature C coding guidelines?
C coding standards?
Do's and don'ts in C?
C coding styles?
-
Yes, see
C programming language coding guidelines.
(TOC:
C language definition
Internationalization
Coding character set
Coding language
Identifier names
Name styles
Name restrictions
Namespace
File names
Compiler warnings
Style warnings
Helper tools
Lint
Metrics
Assertions
Runtime checkers
Code complexity
Nested expressions
Redundancy
Determinism
Modularity
Interfaces
Header files
Resources
Code order
Conditional compiling
Code nesting
Scope
Scope of functions
Scope of variables
Scope of types
Scope of macros
Error prone constructs
Explicit casts
Type size
Array size
Buffer sizes
Macro parameters
Macro side effects
Sign extension
Error checking
Sequence points
Optimizer errors
Style
Numbers
Unsigned numbers
Longs, shorts
Floats
Parameter types
Variable arguments
Portability types
Standard library
NULL macro
Register
Auto
Goto
Multiple returns
Obscurities
Topics left out
Indentation
Tabulators
Braces
Labels
Blanks
Comments
Block comments
)
-
Do you feature C language program samples?
Did you implement projects using the C language?
-
Yes, please see the lrdev
software page
for some C programming language (among other programming languages)
samples.
C++ samples / projects are also featured there.
-
Do you feature C programming language program tutorials?
Free courses in C programming?
-
No.
But there's a
short overview of the C programming language,
with some footnotes.
-
Do you feature the C programming language overview as PDF?
-
Yes, due to public demand;
my
C programming language overview
can be found
here
too,
as PDF
([Adobe Systems]
portable document format)
file
[it's
6.8 times larger, though;
with both files compressed,
even 19.5 times larger].
-
What about the MISRA C coding guidelines?
-
Some of the MISRA rules,
sometimes applied in embedded programming,
are of general use,
most of them mentioned in
my
C coding guidelines,
such as:
conform to the ISO 9899 standard
(#1/#9/#20/#43/#52/#71/#72/#75/#76/#78/#79/#85;
#1.1/#1.2/#2.2/#2.3/#4.1/#8.2/#8.3/#10.1/#10.2/#13.1/#14.1/#16.5/#16.6/#16.8/#19.8),
do checks
/
be robust
(#4/#86;
#16.10),
use portability types
(#13;
#6.3),
scope use
/
minimized scope
(#22/#23;
#8.7/#8.11),
commonly included header files
(#26/#27;
#8.1/#8.8),
determinism
(#30;
#9.1),
use const modifier
(#81;
#16.7),
follow
macro guidelines
(#96),
don't use
atoi/atol/atof
(#125;
#20.10).
Some are intended for embedded platforms
(restricted environments)
only,
such as:
don't use dynamic/heap memory allocation
(#118;
#20.4),
don't use signal handling
(#123;
#20.8),
don't use stdio
(#124;
#20.9),
don't use exit/abort/getenv
(#126;
#20.11)
(actually the build environments for embedded systems should restrict the use of
signal.h, stdio.h, and prototypes for functions such as fopen, getenv,
based on chosen settings).
Some of the rules
restrict language use
for possible error avoidance through simplification
(addressing the possibility of introducing bugs by code semantics misunderstandings),
such as:
avoid octal number literals
(#19;
#7.1),
always use
parentheses within logical expressions
/
don't need to know (?)
(don't trust?)
operator
precedences
(#34/#47;
#12.1/#12.5),
avoid bit operations on
signed
types
(#37/#39;
#12.7),
explicit casts in mixed precision arithmetic
(#48),
no
gotos/breaks/continues
(#55/#56/#57/#58;
#14.4/#14.5/#14.6),
use
final empty else clause
/
no switch fallthroughs
/
final empty default case
(#60/#61/#62;
#14.10/#15.2/#15.3),
avoid
varargs
(#69;
#16.1),
no
recursion
(#70;
#16.2),
have
single
point of exit
(#82;
#14.7),
prefer function over macro
(#93;
#19.7),
avoid too many
levels of pointer indirections
(#102;
#17.5),
avoid
longjmp
(#122;
#20.7);
and some address style (and readability),
such as
not mixing C and assembler
(#3;
#2.1),
avoid comma operator
(#42;
#12.10),
test for zero explicitly
(#49;
#13.2),
use braces
(#59),
avoid pointer arithmetics
(#101).
-
What are the C programming language keywords?
-
auto break case char const continue default do double else enum extern
float for goto if int long register return short signed sizeof static
struct switch typedef union unsigned void volatile while
[BK/DR,1988,p.192,A2.4]
;
[9899:1999]
additionally defines
inline restrict
and
the
[partially optional]
types
_Bool
_Complex
_Imaginary;
[9899:2011]
additionally defines
_Alignas
_Alignof
_Atomic
_Generic
_Noreturn
_Static_assert
_Thread_local
(often used in macros rather than directly)
[
questions on:
auto
(stack alignment),
const
(orthogonality to volatile
),
int
(string to int
),
sizeof
(orthogonality),
struct
,
typedef
,
union
,
unsigned
(often not needed),
volatile
(orthogonality to const
),
/
keywords vs. reserved words
]
-
What are the C++ programming language keywords?
-
alignas alignof
and and_eq asm auto bitand bitor bool break case catch char
char16_t char32_t
class compl
const const_cast
constexpr
continue
decltype
default delete do double dynamic_cast else
enum explicit export extern false float for friend goto if inline int
long mutable namespace new
noexcept
not not_eq
nullptr
operator or or_eq private
protected public register reinterpret_cast return short signed sizeof
static
static_assert
static_cast struct switch template this
thread_local
throw true try typedef
typeid typename union unsigned using virtual void volatile wchar_t while
xor xor_eq
[ISO/IEC 14882:2011].
and and_eq bitand bitor bool compl const_cast dynamic_cast
explicit export false mutable namespace not not_eq or
or_eq reinterpret_cast static_cast true typeid typename using wchar_t
xor xor_eq
were introduced with
[BS,1997]
(p.794, A.2).
alignas alignof char16_t char32_t constexpr decltype noexcept nullptr
static_assert thread_local
were introduced with
[ISO/IEC 14882:2011].
[
questions on:
auto
(stack alignment),
const
(orthogonality to volatile
),
int
(string to int
),
mutable
(vs. const
),
operator
(precedences),
sizeof
(orthogonality),
struct
,
typedef
,
typeid
,
union
,
unsigned
(often not needed),
virtual
,
volatile
(orthogonality to const
),
and/and_eq/bitand/bitor/compl/not/not_eq/or/or_eq/xor/xor_eq
,
]
-
What is the difference between programming language keywords and reserved words?
Reserved words in C language?
-
Reserved words
cannot be used as
identifiers,
whereas
keywords
signify special semantics
(e.g. flow control
[the Lisp programming language accordingly calls constructs
with those keywords
special forms]).
In a reasonable programming language
(such as C),
[all] the keywords are reserved words,
since that makes code easier to read
[and less error-prone,
and easier to
parse].
Fortran's keywords are not reserved words
[which makes parsing more difficult here
and introduces yet another source of errors];
Java's reserved word
goto
is not strictly a
keyword
[since it is only reserved but not used in that language;
however it is listed as a keyword
{to allow compilers to issue more informative diagnostics and
minimize Java/C++ porting problems}].
The reserved words in the C programming language are
[as mentioned]
its keywords;
additionally some identifiers must not / should not be used,
e.g. such beginning with the underscore character
[reserved for the C system/implementation/library
(and used too with keywords later added,
such as _Noreturn)].
[
C keywords
/
C++ keywords
/
Java keywords
/
Turtle keywords
/
coding guidelines about identifier names
/
coding guidelines about name space
/
scope and identifier naming
]
-
What are digraphs?
Trigraphs?
What is bitand?
-
Digraphs
(
<% %> <: :> %:
meaning
{}[]#
)
and
trigraphs
(??= ??( ??< ??/ ??) ??> ??' ??! ??-
meaning
#[{\]}^|~
)
support restricted character sets
(that are obviously missing one or more of the characters
#[\]^{|}~
).
C++ supports
keywords
(and and_eq bitand bitor compl not or or_eq xor xor_eq not_eq
meaning
&& &= & | ~ ! || |= ^ ^= !=
{more consistently [but less conveniently]
and_eq would be bitand_eq,
or_eq bitor_eq,
xor bitxor,
and
xor_eq bitxor_eq})
that can replace
!&^|~
.
[Found in
iso646.h.]
However, one cannot do without the special characters
"%'()*+,-./:;<=>?_
{unless with obfuscating practices}.
Some of the
ASCII characters
($@`
)
are not used in C/C++ programming languages.
[
digraph-hello-world.cc
/
trigraph-hello-world.c
]
-
What is iso646.h?
-
The
C
header file
iso646.h
defines alternatives for some of C's operators:
and &&
and_eq &=
bitand &
bitor |
compl ~
not !
not_eq !=
or ||
or_eq |=
xor ^
xor_eq ^=
[
digraphs
/
trigraphs
]
-
What are the C/C++ operator precedences?
What are operator precedences?
-
The C++ operator precedences can be found
here
and
here.
Since [in C/C++] all expressions establish a [r-]value
(including e.g. assignments),
expressions can be [deeply] nested and
it turned out useful to define many levels of
reasonable precedences
[to mostly suppress the need for parentheses in expressions].
The operators have an associativity as well
(e.g. the assignments: right to left).
In C++ (as opposed to C), the ternary operator seems to have its
precedence switched with the assignments'
(a typo?)
[you might want to use parentheses here, to be portable and
robust].
The bit operators have an unexpected low precedence for historic reason
[predecessor to the logical operators].
Precedences typically do not define
an order of evaluation,
as they do not establish sequence points
[a useful exception here are e.g. the logical operators].
[
operators in the C programming language overview
]
-
C operator precedence tester?
-
A test program for the precedences of
C binary operators can be found
here.
It's a code generator that produces
all combinations of two binary operators (both ways/directions)
and code to check the results of the expressions
against the result of such with appropriate
(and inappropriate) parentheses,
while filtering out tests that produce indifferent results.
The code generator itself is also coded in C
and is itself [quite] precedence independent
(with the help of [redundant] parentheses,
to avoid the bootstrap problem).
[
operator precedences
]
-
What kinds of operators does the C language define?
-
One can distinguish the following kinds of operators:
arithmetic operators,
bit operators,
logical operators,
relational operators,
assignment operators,
data selection operators,
referencing/dereferencing operators,
function call operator,
explicit conversion operator,
increment/decrement operators,
size of an expression/type operator,
ternary operator
and
sequence operator
(see
this overview).
[
operator precedences
]
-
What is the atoi syntax?
What is the ... syntax?
-
atoi's and other's syntax and synopsis can usually be found by
means of '
man atoi
' (Unix manual command [see
question on unix man
too]) or what's appropriate on your system.
[Unfortunately atoi does not have a dedicated error return
value (0 is also a valid non-error return value);
for that reason, the better
strtol,
which features an
(optional) additional reference parameter and a parameter denominating
the input base, should be used.]
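A minimal conversion sketch along these lines (hypothetical helper name parse_long; assuming stdlib.h and errno.h):
#include <errno.h>
#include <stdlib.h>
long parse_long(char const* s, int base, int* ok)
{
    char* end = NULL;
    long value;
    errno = 0;
    value = strtol(s, &end, base);
    /* fail if nothing was consumed, trailing characters remain, or the value overflowed */
    *ok = (end != s && *end == '\0' && errno != ERANGE);
    return value;
}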
[
integer conversion question
/
how to do it in C++
]
-
How is qsort used?
How to program qsort compare-functions?
-
extern void qsort(void* base, size_t count, size_t width, int (*compare)(void const* a, void const* b));
is called with an array of data to sort [base],
the number of elements to sort [count],
the size of a single element in bytes
[width; an array element may be a large structure or just a pointer],
a compare function [compare],
[count and width can typically be expressed with sizeof].
A typical qsort compare function [helper function] looks like
[T
being a data type, e.g. struct stat
]:
static int cmp_func(void const* a, void const* b)
{
T const* aa = (T const*)a;
T const* bb = (T const*)b;
if (aa->x != bb->x) {
return (aa->x > bb->x ? 1 : -1);
}
if (aa->y != bb->y) {
return (aa->y > bb->y ? 1 : -1);
} ...
return 0;
}
;
all data accesses are const,
the helper function can be static
[i.e. not globally visible],
the parameters are passed as generic [void] pointers
and are cast to the target type [while remaining const],
the most significant data element is compared first,
then, if it's equivalent, the next one, etc;
zero return means 'equivalent'.
If there is an additional level of indirection
(as with an array of pointers to data),
T const* aa = *(T const**)a;
,
etc is chosen.
Note that since
typical qsort implementations are not stable/preserving
[i.e. an existing order in groups of elements of equivalent sort key
may/will change],
the compare functions often compare up to a unique key
[e.g. when sorting directory entries by time or size:
sort by that and, if equal, by {unique} name].
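With the compare function above, a call might look like this [count and width expressed via sizeof; qsort is declared in stdlib.h; items is a hypothetical array name]:
T items[100];
/* ... fill items ... */
qsort(items, sizeof items / sizeof items[0], sizeof items[0], cmp_func);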
[
what is quicksort
/
qsort's drawbacks
/
sort-up-to-a-unique-key pattern
]
-
What are qsort's drawbacks?
-
One drawback of
quicksort
implementations is that they are often/usually not stable
[i.e. a pre-existing order in elements (of equivalent sort key)
may/will change].
A drawback in
C's
qsort's
interface
is the missing user data
(instance data for the compare functions),
that is usually passed to callback functions,
so the comparison functions have to behave stateless.
-
What is strtok?
Strtok nesting?
-
strtok is a simple string tokenizer
[used for lexical analysis / parsing].
For several reasons it's often not the appropriate tool:
strtok is not reentrant
(therefore can't be
nested:
a second tokenizer instance would overwrite first's {global!} state)
[see
question on reentrancy]
(although some platforms [additionally] provide a reentrant version of
this function
{e.g. [Solaris']
strtok_r
}),
implements only limited patterns
and the buffer to be parsed is modified.
Typically it's more convenient [for a parser] to return the whole
[parsing-] state to the caller and use an end-index instead of [nul]
string termination
[lrctpz.zip:ctpzut.cc:SetParser::enumerate
(C++ sample)
implements
such a parser pattern,
with the help of a (simple) StringRange
helper class].
Java's
nio's
Buffer
,
with its 'viewed buffer'
[layering]
abstraction,
implements
this useful pattern
and
multiple indices
['capacity', 'limit', 'position', 'mark']
too.
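Where available (e.g. on POSIX systems), a nested-tokenizing sketch with strtok_r might look like this (hypothetical function; note the buffer is still modified):
#include <stdio.h>
#include <string.h>
void tokenize(char* line)
{
    char* outer_state;
    char* inner_state;
    char* rec;
    char* fld;
    for (rec = strtok_r(line, ";", &outer_state); rec != NULL; rec = strtok_r(NULL, ";", &outer_state)) {
        /* a separate state pointer per level allows nesting, unlike strtok's single hidden state */
        for (fld = strtok_r(rec, ",", &inner_state); fld != NULL; fld = strtok_r(NULL, ",", &inner_state)) {
            printf("[%s]", fld);
        }
        printf("\n");
    }
}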
-
What is the meaning of reentrant?
What are the problems with reentrancy?
-
A function may [should!] be reentrant, which means
that it may be invoked again (re-entered), while it is still
referenced by one of the program-flow contexts (stack frame,
thread context).
Synchronous reentrancy pitfalls include static state [i.e.
static data (static variables), or global data], which is overwritten by
multiple function invocations (either explicit, implicit [i.e. indirect;
often difficult to tell, see errno problem], or through
recursion).
Asynchronous reentrancy pitfall situations include multithreading and
signals [or interrupts, e.g. in interrupt-based embedded systems].
It doesn't usually matter if the underlying hardware is a
single-processor system or a multi-processor system;
however the chances of asynchronous problems occurring are typically
higher on multi-processor systems [due to their finer-granular
concurrency {not just interruptions by task-switches, only every
microsecond or so}].
Libraries may contain non-reentrant functions, that either
require to keep state [libc: e.g.
strtok],
or just are implemented impure [which may be considered a
bug].
[
reentrant lock
]
-
What is it about scope?
-
Scope is important.
In C there's application, file, function and block scope
that can be applied on functions, variables, types and macros.
Using optimal scope is a means
for [good] encapsulation [an OO-paradigm],
which makes code frameworks less complex to maintain.
Scope tends to influence [identifier] naming:
the smaller the scope, the less strict naming rules are.
C++ additionally knows of class member access modifiers
[
private, protected, public, friend
]
that control scope in a class relationship manner.
[
coding guidelines' scope chapter
/
Java's access modifiers
]
-
What is a typedef?
-
C's typedefs introduce type aliases
[used as
portability types].
They are typically not or only less used in C++
[the more powerful class construct is used there,
with user defined cast operators,
and possibly
polymorphism
(both of which can be used to express tight relationship)].
-
What is a typeid?
-
C++'s
typeid
operator returns a class
type_info
reference, which, as the name suggests, carries
data type information.
type_info
provides a printable name [as string]
and comparison operators [to check for type equality and make
type information data sortable].
type_info
is part of C++'s run time type information
(RTTI) support [as dynamic_cast
also is].
RTTI support instantiates several helper data
structures per class [which are usually linked into the classes'
virtual
tables] and sometimes requires extra compiler options.
-
What is the meaning of C++'s 'virtual' keyword?
Virtual table layout?
Virtual inheritance?
-
'virtual' means 'indirection'; either in method
invocation or in composite object access.
These indirections are needed for C++ to implement
polymorphism
and diamond-shaped inheritance.
See
this
for background information and some assembly dumps.
[Additional keywords:
vftable, vbtable, vmtable.]
-
Is the virtual table layout compiler specific?
-
Yes,
even compiler option specific.
However,
table layout will be similar
with other compilers or other compiler settings,
since there is one quite optimal way
(concerning code size and speed)
how to do it.
[
virtual
/
virtual tables
]
-
What are asserts?
Assertions?
-
Asserts are a compact means to make sure software conditions
are met, often to assure program-internal logic.
Asserts are more often used in C than C++ [since the latter offers an
exception handling system that allows compact error handling as
well {i.e. exceptions need not be handled at every
function-call level, and uncaught exceptions fall through to
the top (and have similar consequences there as the assert macro)}].
Often asserts are used to assure preconditions and postconditions, to
assure interface contracts.
Data [object] invariants are also candidates for asserts.
[
coding guidelines' assertions chapter
/
sample that asserts some internal logic
/
question about code robustness
]
-
How do I check for integer overflows?
-
C doesn't provide any mechanisms to catch integer operation
overflows.
Usually the problem domain (the used number range) should be
well known and input data sufficiently checked to
avoid integer overflows.
Input can be manually checked before or after operations, as e.g. done
in
strtol
implementations.
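A minimal pre-check sketch for signed addition (hypothetical helper; assuming limits.h):
#include <limits.h>
int add_checked(int a, int b, int* result)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return 0; /* would overflow; *result is left untouched */
    }
    *result = a + b;
    return 1;
}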
Most processors detect overflow conditions and on machine-level
provide opcodes to trap/interrupt (and such to [just] branch)
on these conditions
[e.g.
sparc:
tvs
(trap on overflow set)
and e.g. the {esoteric}
taddcctv
(tagged add and modify integer condition codes and trap on overflow);
x86:
into
(interrupt on overflow)].
The carry flag and the sign flag aren't directly
accessible from the C language either.
-
How do I check for buffer overflows?
-
Make sure buffer size information is handed through to
each function using the buffer.
Make sure it's used there.
E.g.: Don't use
'gets'
(use 'fgets' instead).
Do manual reviews that ensure these properties.
[Note that
Java
has its virtual machine checking for
buffer overflows
{which makes Java a more robust language}.]
[
coding guidelines' buffer sizes chapter
/
stack growing down leading to more severe buffer overflows
]
-
Can we overwrite functions in C?
Can we overload functions in C?
-
Function overloading, i.e. providing functions of the same name with
different signatures (parameter number and/or types), is
not supported in C.
Overriding (run-time dispatch) can be emulated manually with structs and
function pointer members
[see e.g.
OpenSSL source code].
-
How do I convert strings to integers?
How do I convert hexadecimal to binary?
Int to hex?
-
C [compiler] supports only decimal, hexadecimal and octal
numbers as number literals (in the notations
10
,
0x10
and 010
).
[As number literals typically shouldn't be used in code anyway
(see
here),
this restriction isn't too severe.]
A C programmer typically isn't concerned with a machine-internal number
representation [two's complement,
bcd,
endianness,
etc]
either
(lower level libraries are).
The conversion from [external] string representation to
internal number representation should be accomplished by
strtol
(which features an input base parameter and
an optional scan-end-position reference parameter [which can be
used for error checking and further scanning]; possible
overflow
is checked too [since int
s,
long
s, etc can only hold a limited number range]).
[Don't use scanf and atoi, for different reasons;
see
'atoi' question
too.]
Conversion from internal representation to a string
('printing') is usually done with sprintf
(better,
if available: snprintf
).
[s][n]printf
unfortunately offers only the d
,
x
and o
(decimal, hexadecimal and octal)
format specifiers, i.e. not all representable bases are
supported [unlike e.g. with Lisp, that offers the optional
:base
and :radix
parameters and the
~
nR
print-format; input can be
specified by #
nr
...
{e.g. #3r
...}, #x
...,
#o
... or {binary!} #b
...].
Numbers' string representations usually allow bases from
2 to 36 (inclusive) [using 10 digits and 26 latin letters].
[
coding guidelines' chapter on sign extension
/
how to convert to integer in C++
]
-
How do I convert strings to integers in C++?
-
As C++ is [mostly] a superset of C, the
C conversion methods
can be applied too.
However for [generalized] scanning and printing
C++ offers
the
type safe
istream/ostream operator>>/operator<<
mechanisms.
E.g. to convert from string,
use
istrstream s(input);
int i;
s >> i;
if (s.tellg() == 0) ...
[scan position advancing used to check for parse errors].
Note that the ios
class, from which istream
and ostream
[and istrstream,
etc] derive,
does not implement an input/output base,
but only some special bases [dec, hex, oct,
implemented as flags].
[
C conversion methods; additional background information
]
-
Why do I get this ominous ffffffe4 instead of e4 when printing a hexadecimal representation of a character?
Why does it read ZÿFFFFFCrich instead of Zürich in mails I receive?
-
An explicit cast is missing
before the data type conversion from
char
to a larger numerical type
[e.g. to an int
;
see
here].
Often char
s are signed
and widening will
extend that sign, so [non-ASCII] characters above 0x7f will result in
negative int
s [with leading 1-bits {ffffff
prefix}];
e.g. a MIME-quoted-printable encoding '=FF
' in
this case may be interpreted as 'ÿ' [y-umlaut; in iso-latin-1];
an URL-encoding '%FF
' may lead to the same.
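A minimal sketch of the effect and the fix (assuming a platform where char is signed):
#include <stdio.h>
void show(void)
{
    char c = (char)0xe4;              /* e.g. 'ä' in iso-latin-1 */
    printf("%x\n", c);                /* may print ffffffe4: widening sign-extends the signed char */
    printf("%x\n", (unsigned char)c); /* prints e4: explicit cast before the implicit widening */
}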
-
How do I calculate an integer's binary logarithm?
-
The binary logarithm
{lb(1) == 0, lb(2) == 1, lb(4) == 2, ...}
can either be calculated iteratively
(by dividing by two until zero is reached;
or comparing to powers of two
{
1<<0, 1<<1, 1<<2,
...})
[runtime O(n),
where n is the bit size of an integer {e.g. 32}],
or by a binary search
[through an array of the powers of two]
[runtime O(lb(n)), {e.g. lb(32) == 5}].
Taking advantage of the usual
two's complement integer representation,
one may scan for the highest bit
[runtime O(n)],
or apply bit patterns
[that implement a specialized binary search
{a binary binary search?;-)
}]
[e.g. on 32-bit platforms
bt(0xffff0000u);
bt(0xff00ff00u);
bt(0xf0f0f0f0u);
bt(0xccccccccu);
bt(0xaaaaaaaau);
where bt(m) is
i <<= 1;
if ((p = n & m) != 0) i |= 1, n = p;
]
[runtime O(lb(n))].
Note that many processors implement opcodes to do that,
such as bit-scan, highest-populated-bit, etc;
that however are not necessarily faster than the solution above.
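A minimal sketch of the bit-pattern variant (hypothetical function name lb32; assuming 32-bit unsigned int; lb(0) is undefined and returns 0 here):
unsigned int lb32(unsigned int n)
{
    unsigned int i = 0;
    unsigned int p;
    /* each step halves the range of candidate bit positions: a specialized binary search */
    if ((p = n & 0xffff0000u) != 0) { i |= 16; n = p; }
    if ((p = n & 0xff00ff00u) != 0) { i |= 8; n = p; }
    if ((p = n & 0xf0f0f0f0u) != 0) { i |= 4; n = p; }
    if ((p = n & 0xccccccccu) != 0) { i |= 2; n = p; }
    if ((p = n & 0xaaaaaaaau) != 0) { i |= 1; n = p; }
    return i;
}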
[
ffff0000 and similar bit patterns
]
-
How to reverse bit patterns?
-
See
bitreverse.c
for implementation(s), comments and
x86
assembly.
-
Are there guidelines concerning stack usage?
Stack versus heap?
-
Unlike Java, C and C++ are able to instantiate a lot more data [objects]
on stack.
Typically data that has function-body lifetime should be
realized on the stack; data that has dynamic extent
(i.e. that is returned to calling functions [not by-value] and
referenced there)
must be allocated on heap.
The heap can be used too (the Java way)
if there exist [severe] stack size restrictions.
Static data should not be applied
[in that or other cases],
since that makes a function non-reentrant
[see
question on reentrancy].
alloca
shouldn't be used either
since it is not so
portable/robust
and tends to irritate debuggers.
[
stack based programming languages
/
stack alignment
/
pascal style stack clean-up
/
stack corruption
/
stack growing up or down
/
stacks in recursion
/
stack size in quicksort implementations
/
stack overflow with recursion
]
-
What about stack alignment?
C auto variables alignment (stack data alignment) vs. data segment variables alignment (global data alignment)?
-
Often the stack pointer is sufficiently aligned
for larger primitive types
(
double, long long, long double,
etc)
[this is of course especially the case for CISC machines,
which often don't really require data alignment
{however run faster with it}],
so no additional steps need to be taken
besides subtracting the required number of bytes.
If not:
the stack frame may be aligned by rounding it down
[e.g. {x86} esp &= ~0xf
]
(in the case of stack growing down);
this typically requires saving its old contents
[e.g. in {x86} ebp
].
Global data is usually rather generously (i.e. not less) aligned,
compared to stack data.
[This issue is handled by compilers,
not the C programmer,
however worth some consideration.]
-
Can a goto statement corrupt the stack?
-
No.
However it allows to skip over variable initializations which
may make a problem look like stack corruption.
longjmp
can leave corrupted stacks
[if used inappropriately].
An
overflow
of a local buffer is the usual reason for stack corruption.
-
Can the stack leak memory?
-
No.
Neither does stack usage fragment memory.
However,
excessive
recursion
[with a
stack
overflow]
can be considered 'stack-leaking'.
-
How to write a function that shows whether the stack is growing up or down?
-
First:
You don't want to know;
it usually doesn't matter.
Second:
Step in a debugger;
the register display will indicate
stack growth
on push/pop, call/return and similar operations.
Third:
Call a function with references (pointers)
at auto variables from differently nested stack frames and
compare the pointer addresses;
use the usual precautions
[use function's input and output,
so nothing gets optimized away,
encapsulate in distinct compilation units,
so the optimizer doesn't have the overall picture
{since memory locations of unrelated data objects
are not required to fulfill certain semantics}].
Sample:
stackgrowth.tar.gz.
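A minimal two-file sketch of the third approach (hypothetical names; note that, as stated, the standard gives no guarantees about comparing addresses of unrelated objects):
/* probe.c */
#include <stdio.h>
void probe(char const* outer)
{
    char inner;
    printf("stack appears to grow %s\n", (&inner < outer) ? "down" : "up");
}
/* main.c */
extern void probe(char const*);
int main(void)
{
    char outer;
    probe(&outer);
    return 0;
}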
A stack growing down may lead to
[from the security point of view]
more severe
buffer overflow
problems
[since it can overwrite return addresses and
so easily manipulate program flow].
-
What is byte order?
-
There are different ways how to map
a multi-byte primitive
to memory, disk or network.
The most usual mappings
(out of all [possible] permutations
[e.g. totally 24 with 32-bit/4-byte integers])
are little endian byte order
(encoding in ascending significance,
least-significant byte is encoded first:
0 1 2 3)
and big endian byte order
(encoding in descending significance,
most-significant byte is encoded first:
3 2 1 0).
Note that most of the other permutations are not seen at all;
however, one-byte groups can be
handled differently from two-byte groups, etc.
(1 0 3 2 and 2 3 0 1
in the 32-bit case,
sometimes called 'mixed endian');
this is however rarely seen.
Some CPUs (e.g. MIPS) can serve either endianness
(to enable compatibility with different operating systems),
which is however usually selected only at boot time.
[
network byte order
/
byte order conversion
/
byte order indicator
]
-
What is network-byte-order?
-
It's easier and less error prone to normalize data,
than to figure out if talking to a similar or a different system.
Unix
quite early
defined a 'standard' encoding order,
how data is to be written to file
and especially to network
(since Unix was networked, interchanging data).
Network byte order is
big endian byte order
[for historic, incidental reasons;
there is no reason to prefer one or the other,
(by the way: not even for human representations)].
Note that there are
different views on 'standard' encoding:
Windows systems rely on
little endian
(for historic reasons too,
since their first implementations were on Intel
x86
processors);
CORBA e.g. chooses to leave
the choice of endianness up to the user
[and includes an endianness flag in its encoding;
allowing for different encodings
(to allow a tiny bit of optimization);
it's not a good idea to do so:
it decreases the robustness of the data protocol].
[
convert from/to network byte order
/
document normalization
/
Java implicitly uses network byte order
]
-
How to convert to network byte order?
How to convert from network byte order?
-
Data conversions into
network byte order
are done manually
by using the
htonl
function family
['host to network byte order';
htons
, htonl
, htonll
,
ntohs
, ntohl
, ntohll
].
hton/ntoh operations are cheap
and happen to be no-ops
[identity operations]
for one class of systems
[the
big endian
systems].
With the
little endian
systems,
the ntoh functions are identical to the corresponding hton function,
reversing the order of the bytes.
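A minimal sending sketch (hypothetical helper; assuming POSIX arpa/inet.h and unistd.h; the receiver applies ntohl):
#include <arpa/inet.h>
#include <stdint.h>
#include <unistd.h>
int send_u32(int fd, uint32_t host_value)
{
    uint32_t wire_value = htonl(host_value); /* big endian on the wire; a no-op on big endian hosts */
    return write(fd, &wire_value, sizeof wire_value) == (ssize_t)sizeof wire_value;
}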
[
network byte order
/
"[htonl] statement has no effect" warning
]
-
How do I find out whether I'm on a big-endian or little-endian system?
Byte order indicator?
-
[Again:]
You usually don't want / don't need to know.
If writing binary data to disk
or sending it over a network,
normalize
it first,
to the well-defined
network byte order,
(by using the
htonl
function family),
or use ASCII encoding
(i.e. use a more human readable
and system independent representation).
Use
the snippet
printf("%x\n", *(int const*)"abcdefghijklmnop");
to generate an endianness code,
e.g. 64636261
for little endian, 32 bit
[ASCII encoding].
[
network byte order
/
document normalization
]
-
Nested loop guidelines?
Block nesting?
Nested if-statements?
-
Often loops
are iterations over data structures;
the loop body [code]
imposes specific actions on each data member,
and can be put into a [local] function
[that is given an appropriate name;
'if it can be made a {conceptual} abstraction,
put it into a function'].
This approach leads to smaller functions
and avoids loop nesting.
Loop iterators can be implemented generically
[have abstract functions
iterating over data structures,
calling application-supplied specific call-back functions
for each data member
{internal
iterators}].
Less nesting depth means less state
(and less complexity;
each loop context means additional state
in the form of indices and other iteration variables);
small and less complex functions
are readable and maintainable.
I would not declare nesting-depth metrics,
but rather an optimal range
of number of lines
[or function-points]
per function
[which would lead to less [or no] nesting too,
since loop frames and bodies count as LOCs too].
[Speed-]
optimized algorithms
sometimes use nested loops
representing
different orders of state change
[and may use
gotos
{that only allow to jump inside a function}
to change such states in fine-granular ways
{often
[the better abstractions than goto
?]
break
and continue
are sufficient}].
[Break-out of nested loops in C/C++ requires gotos too,
unlike with
Java's
multi-break.]
I prefer early returns to block nesting;
often 'normal' code runs down from top to bottom of a function
and special cases and error cases
can be handled in if-blocks with early-returns.
If-statements in the form of
if (...) {...} else if (...) {...} else if (...) {...} ... else {...}
[tail nesting]
are conceptually not considered nested
and are not indented
[K&R2].
[Tail-nested ternary statements
can be formatted in a similar way,
see e.g.
ipow sample.]
Complicated nested if/else blocks
usually are better handled with a revised better data model.
[
code nesting
/
strtok nesting
/
multiple returns
/
iterator pattern
]
-
malloc vs. calloc?
-
p = calloc(n, m);
is semantically equivalent to
p = malloc(n * m);
if (p != NULL) memset(p, 0, n * m);
,
typically meaning:
allocate an array of n data objects of size m bytes each
[m typically expressed in terms of sizeof
],
with implicit initialization to 0.
In C/C++ 0 has
a well defined and useful meaning
for the primitive types
{and hence in C for the aggregated types;
C++'s object composition includes additional stuff,
like pointers to meta data [needed for polymorphism],
which must not be 0}.
n*m is
the over-all size.
It is a good practice
to initialize
data [buffers],
especially dynamic memory,
before use,
to make code more
deterministic,
and robust
(which is especially of help at bug finding time
['searching for bugs that show non-deterministic symptoms
is an unpleasant task at best']).
In symmetry with initialization of
stack based data,
I prefer the malloc/memset approach.
Dynamically allocated memory is released with free
[however, the heap {the area which serves malloc et al.}
typically doesn't shrink
{use mmap based allocation for that}].
[
determinism
/
program robustness
/
stack vs. heap
]
-
union and struct?
-
union and struct are syntactically similar,
struct defining a
sequence
of data fields
[with possible padding {emerging from alignment}],
union overlapping data fields
[with different types; all starting at offset zero].
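A minimal sketch of the difference (hypothetical type names):
struct pair { int x; int y; };   /* two fields laid out one after the other (possibly with padding) */
union value { int i; float f; }; /* both fields share the same storage, starting at offset zero */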
-
How do I convert C++ single line comments to C comments?
-
A simple answer would be:
apply the regular expression substitute pattern
's,//\(.*\),/*\1 */,'
,
but that doesn't consider //
in 'problematic'
lexical contexts, such as in a string literal!
The better answer is to use a parser [more precise: a lexical analyzer]
such as
ccomment
['Care has been taken of: strings, character literals, constructs
like '//.../*...*/' and performance.'].
-
How to have variable arguments macros?
-
Macros with variable arguments are unfortunately not supported
in
[K&R2]
C
[macro syntax is too simple to support such things
{e.g. compared to Lisp's macro system}].
However, as macros are expanded call-by-name, a
parameter list can be specified as one macro argument [this is
however not very flexible].
The newer
[9899:1999]
C standard
allows variable arguments macros
[providing the
__VA_ARGS__
meta macro],
that however supports
only simple things
[such as forwarding the argument list as a whole;
i.e. isn't very flexible either;
parsing the variable argument list e.g. is actually not possible].
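A minimal [9899:1999]-style sketch (hypothetical macro name; it merely forwards the whole argument list):
#include <stdio.h>
#define LOG(fmt, ...) fprintf(stderr, "log: " fmt "\n", __VA_ARGS__)
void sample(void)
{
    LOG("%s: %d", "count", 42);
    /* LOG("plain"); would not compile in strict C99: at least one variable argument is required */
}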
C++'s inline code and template code [possibly
together with function overloading] are acceptably flexible.
-
Are there macro debug guidelines?
-
Additionally to macro design guidelines such as
don't use macros with side effects
[e.g. with macro parameters being referenced not exactly once]
and
have the macro argument references fully parenthesized
[to suppress problems with
operator precedences],
some rules can be formulated on how to approach macro design
safely and make macro debugging easier:
If the macro expression is other than trivial, implement things as a
function [first].
If using C++ [instead of C], consider inlined functions [or
templates].
Debugging macros in the sense of stepping through them
usually requires them to be transferred into the form of functions [too;
which e.g. provides single code locations to set breakpoints].
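A minimal sketch of the two design rules (hypothetical macro and function names):
#define SQUARE(x) ((x) * (x))            /* arguments and the whole body fully parenthesized */
/* still a side-effect trap: SQUARE(i++) expands to ((i++) * (i++)), referencing the parameter twice;
   prefer a function (debuggable, single evaluation) for anything beyond the trivial: */
static int square(int x) { return x * x; }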
[
assert debug macro
]
-
What about gets?
What are gets alternatives?
-
Don't use
gets
.
Use fgets
instead.
Why: possible
buffer overflows
that cause
program robustness problems
and severe security risks.
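A minimal replacement sketch (hypothetical helper; the buffer size is passed in and used):
#include <stdio.h>
int read_line(char* buffer, size_t size)
{
    if (fgets(buffer, (int)size, stdin) == NULL) {
        return 0; /* end of file or read error */
    }
    /* buffer is always nul-terminated; a trailing '\n' is kept if the whole line fit */
    return 1;
}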
-
What is IO.H?
IO.H vs. io.h?
Where can I find io.h?
-
io.h contains [low-level] file handling and input/output
function declarations for the win32 platforms,
such as
access
or write
.
struct _finddata_t
[for directory traversal]
and the FAT file attributes [read-only, hidden, system, archive]
can be found there as well.
Some of the io.h API maps to the Posix API
[on Unix found in unistd.h],
other parts are additions [such as chsize
].
io.h and IO.H are synonyms since win32 filesystems, such as FAT [FAT16,
FAT32] and NTFS, are case-insensitive.
Header files to standard libraries are, as C [and C++] development kits,
not included in standard win32 operating system installations;
they come with those development kits.
-
What is pascal style?
-
'Pascal style' names the order of placing local (static)
functions before their references in C source files [and hence
avoiding function prototypes].
Pascal style is not feasible with indirect recursion.
'Pascal calling style' names the convention of letting the
called C function clean up the stack [instead of the calling function;
this saves some opcodes].
Pascal calling style is not feasible with variable arguments.
[
coding guidelines' code order chapter
]
-
What are the hidden variables in the C language?
-
C and Posix do not really define many
global variables;
functionality based on interfacing through global variables is
inflexible in implementation,
breaks encapsulation
and poses synchronization problems.
Even
errno
is stated as
'integer expression' (not 'integer variable') in
K&R2
[and typically implemented by means of a function returning a
pointer {to thread-local data}].
Sometimes, environ
,
_argv
or __argv
,
etc,
are found,
but are proprietary and not
portable.
Hidden can be considered whatever is external
[accessible] in libc
and not defined in published header
files
[but of course should not be used!].
[
question on reentrancy
]
-
What does type safe mean?
Why do modern languages not use implicit declarations?
-
Type safe means that,
if possible,
data types used in assignments and
function calls, etc,
are checked for correctness at compile time.
This typically saves the programmer a lot of
debugging effort.
The C programming language is
a quite strongly typed language;
this means that
[data] types, variables and functions
must [should!] be declared
before their use
[see
my C overview].
This kind of type safeness is also called static type checking,
as opposed to dynamic type checking
that is additionally supported by languages such as
Java,
C++ [optionally, depending on type of cast and compiler flags] and
Lisp/CLOS,
requiring run time type information.
C's generic references,
that are often declared with
void*
,
can't be checked;
object-oriented programming languages use the safer
polymorphism
and inheritance [hierarchies] for that.
Older languages such as BASIC mostly didn't apply
static type safeness, leading to
[possibly] late run-time errors.
Modern counter-samples to typesafeness are:
C's unsafe/unchecked casts,
C/C++'s use of integers as booleans,
Java's polymorphic vector/hashtable/etc.
-
What means 'robust' in C/C++?
Robust code?
-
One key issue with code robustness is
robustness against possible [user] inputs:
a program should not crash or otherwise misbehave when fed with certain
[possibly unexpected] inputs.
Classic examples are
buffer overflows,
which often establish a severe security risk.
Another issue is robustness against code changes:
logical dependencies and function side-effects should
be minimized, to keep the risk of introducing code errors small.
Checking code constraints often
(see
assertions question)
is a counter-strategy here.
Other problems may arise from scalability:
algorithms should be chosen and implemented in a way that resource
consumption doesn't grow into extremes with unexpected or untested input
values.
Handle interface ambiguities conservatively:
don't rely on the behavior of one or several
specific implementations;
rely on the interface definition (i.e. manual page, etc) only,
and
test
the interface.
C: error return values must be checked consistently (C++ offers
more convenient formalisms with its exception system),
even test for such things as running out of disk-space (and indirect
consequences of such problems).
A wish for code robustness is a typical reason for
code reviews.
Robust code minimizes code maintenance costs.
Lean
implementations
are often more robust.
[Premature] optimization is often less robust.
-
What is portability?
-
Portability is a code property of being usable
[compilable, linkable, and runnable]
on a variety of computer systems.
Portability is often achieved by being conservative about 'features'
usage
[i.e. check whether a function is specified
{by a standard} and not just 'present',
don't assume specific sizes of data and data alignment,
don't deduce function semantics from tests
{but from
manual pages}].
Often portability considerations are done together with
robustness
considerations
[indeed: robustness against change of environment].
Portability [in e.g. C] is mostly an issue of library
[API] portability and not so much an issue of programming
language portability.
On
Unix
APIs are now more unified than earlier.
Java
does quite a good job
specifying APIs
[with the exception of useful things like
StringTokenizer
and tiny
details like Thread.getName
missing in
J2ME/MIDP/CLDC].
You might expect code of good quality to be portable.
System specific code is often separated from the core
code [or from most code] to keep that portable.
[
portability types
/
typedefs
/
minimizing Java porting problems by adding C/C++ keywords
/
unportable 'hidden' variables
/
robustness
/
C++ to Java
/
C to Java
/
Java to C++
]
-
How to port C++ code to Java?
-
Porting
C++
code to
Java
is probably easier than
porting Java code to C++
[leaving away performance concerns
in the first step].
Although C++ is a richer language compared to Java,
offering quite a few additional features,
problems arising seem to be acceptably easy to cope with.
Performance considerations may partially be outweighed by
faster APIs (e.g. nio new I/O API)
[possibly using native code]
and faster hardware
[moore's law].
Some C++ language features deserve detailed consideration:
function pointers
[and member function pointers,
which are e.g. used for callbacks;
can usually be replaced by
defining and implementing interfaces],
constness fine-policies
[are not supported by keywords in Java;
need to be handled
by cloning at appropriate locations],
templates
[replaced by less generic classes or
polymorphism],
multiple inheritance
(not in the sense of {pure} interfaces)
[needs redesign],
friend access modifier
[usually replaced by package access].
Minor syntactical details include:
cast overloading
[tighter control over type conversions;
explicit conversion methods needed in Java],
operator overloading
[semi obscure, but allows short code {where appropriate};
not supported in Java;
needs to be replaced by method calls],
typedef
[for type name aliases;
use inheritance or the proxy pattern],
reference-type function parameters
[e.g.
int&
;
holder needed or slight redesign].
C++ things typically not needed include:
default parameter values
[overloading is sufficient],
static function variables
[not useful with multithreading or
reentrancy],
macros
[unsafe because of side effects],
conditional code compiling
[separable to platform specific jars].
[
C to Java
/
Java to C++
/
differences between Java and C++
]
-
How to port C code to Java?
-
C
and
Java
have different design approaches;
Java is object-oriented and uses quite dynamic data handling;
Java is often centered around [class] framework development;
C is usually used in a non-OO manner and may use static data;
C is more often oriented towards
solving a single concrete problem,
often in a fast manner,
running on
lean
environments.
Porting C to Java often involves
a (complete) redesign.
[
C++ to Java
/
Java to C++
]
-
How to port Java code to C++?
-
Porting
Java
code to
C++
is probably more difficult than
porting C++ code to Java
because of C++ language's difficulty.
Porting requires
major memory management changes
[away from Java's garbage collection paradigm's
implicit cleanup;
some C++ environments provide heuristic garbage collectors,
autopointers
e.g. help to make resource allocation exception-safe].
[
C++ to Java
/
C to Java
/
differences between Java and C++
]
-
What is orthogonality?
-
Orthogonal
means 'unrelated'
and 'not influencing each other'.
[The term comes from vector geometry,
where an orthogonal vector
starts a new dimension;
the scalar product of two orthogonal vectors
is zero.]
Orthogonality in software development means:
things that are not related conceptually
should not be related in the system;
new features shouldn't break or change
existing ones.
In orthogonal systems,
a change at one end
doesn't/shouldn't affect the other end
[no ripple effect].
Orthogonal software components
often have no or minimized side effects,
are often small, uncoupled, pieces
and often freely combinable.
Orthogonal design
avoids 'special cases'
and restrictions.
This makes systems more intuitive to use
[{at least} for the systematically thinking].
['Special cases' may be a sign of missing design.]
[
programming language orthogonality
]
-
What is programming language orthogonality?
Orthogonal language?
Orthogonal C?
-
Programming language orthogonality
means the language is designed
so that all [or most] [possible] expressions have a meaning
[and therefore are 'defined'
{in the sense of language specification}].
Orthogonality may make a programming language
more intuitive to use.
C is a sample of
an orthogonal programming language.
C allows to chain and nest expressions:
i = j = 0;
f(g(), i++, ++j);
i = f(k = 2, h(g()), i);
;
sometimes it may be convenient
to have such language constructs.
[That however doesn't mean you have to or should use them.
Readability and debugging convenience will decrease
when using nested or chained expressions;
e.g. source code debugging doesn't show intermediate data,
sub-expressions can't be reassigned a value,
{at least not easily, without showing/using [cpu] registers},
etc].
Unless necessary
C doesn't lock in [certain] expressions [e.g. assignments]
to be a special statement case
[what possibly would make the language less usable
and the grammar more complex].
C's operators are defined for different data types
[e.g. math vs. pointer arithmetics];
sizeof is defined for
expressions [including variables]
and types.
[Counter-intuitively to some]
C's const
and volatile
keywords
are orthogonal
[i.e. don't exclude each other]:
const
means 'can't be modified by the code'
[relevant for putting data in {shared} read-only segments
and for contracts on data access policies];
volatile
means 'is subject to asynchronous change'
[relevant to optimizations done by compilers];
[9899:1999]
names the sample of a hardware clock
[mapped to an unwritable memory address],
declared using both modifiers.
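A minimal declaration sketch of that example (hypothetical identifier name):
extern volatile const unsigned int real_time_clock;
/* volatile: the value may change asynchronously (the hardware updates it);
   const: the program itself must not write to it */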
Another example of orthogonality is Lisp's sequence operators,
which work not only on lists but also on arrays, strings, etc
[the concept of sequences here is
somewhat orthogonal to the concept of lists,
however this may looked on as
polymorphism;
some consider Lisp not to be an orthogonal language
since often there's not only one single construct
implementing a concept,
e.g. if/else statement doable by
if
,
when
,
unless
,
cond
,
and
,
or
, ...
{question:
compact orthogonal set
or compact orthogonal working set?}].
[
orthogonality
/
C programming language overview
/
C programming language coding guidelines
]
-
What is polymorphism?
-
Polymorphism is
a core concept
in object-oriented programming:
a method call on an object
is dispatched
at run-time
by the concrete type of this object.
It enables
the use of
pure abstract types
(generic types,
in
Java:
interfaces),
and makes the following concepts
more useful:
inheritance,
class hierarchies,
partial implementations
(abstract classes)
and
multiple inheritance.
Not polymorphic [i.e. not object-oriented]
are e.g. the language constructs
dynamic_cast (C++),
instanceof (Java),
switch (C/C++/Java)
and
typeid
(C++).
C++ requires
the
virtual
modifier to make things
{methods and classes}
polymorphic
[since C++ too supports
{a bit faster}
non-polymorphic code behavior],
Java doesn't.
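A minimal C++ sketch (hypothetical class names):
#include <iostream>
struct Shape {
    virtual double area() const = 0; /* 'virtual' makes the call dispatch on the concrete type */
    virtual ~Shape() {}
};
struct Square : Shape {
    double side;
    Square(double s) : side(s) {}
    double area() const { return side * side; }
};
void print_area(Shape const& s)
{
    std::cout << s.area() << '\n';   /* resolved at run time via the virtual table */
}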
-
Do you have samples of syntax coloring?
-
No;
the
sample
(C++)
that should give you a hint what syntax coloring looks like,
has just colorized comments,
not colorized keywords, identifiers, etc.
[
diff coloring
]
Java programming language
-
What are the Java programming language specifications?
-
[The Java Language Specification,
Java SE 8 Edition,
James Gosling, Bill Joy, Guy Steele, Gilad Bracha, Alex Buckley,
Oracle America Inc.,
2014]
defines
the Java programming language.
Older versions were
[The Java Language Specification,
Java SE 7 Edition,
James Gosling, Bill Joy, Guy Steele, Gilad Bracha, Alex Buckley,
Oracle America Inc.,
2011]
and
[The Java Language Specification,
James Gosling, Bill Joy, Guy Steele, Gilad Bracha,
Addison-Wesley,
3rd Ed. 2005,
ISBN 0-321-24678-0]
and
[The Java Language Specification,
James Gosling, Bill Joy, Guy Steele, Gilad Bracha,
Addison-Wesley,
2nd Ed. 2000,
ISBN 0-201-31008-2]
and
[The Java Language Specification,
James Gosling, Bill Joy, Guy Steele,
Addison-Wesley,
1996,
ISBN 0-201-63451-1].
The Java language specifications can be found
in electronic form at
java.sun.com,
docs.oracle.com.
[
openjdk
/
Java keywords
]
-
What are the Java programming language keywords?
-
[JG/BJ/GS/GB/AB,2014]
defines:
abstract assert boolean break byte case catch char class const continue
default do double else enum extends final finally float for goto if
implements import instanceof int interface long native new package
private protected public return short static strictfp super switch
synchronized this throw throws transient try void volatile while
;
The keywords
const goto
are reserved, though not used.
The words
false null true
[technically] are considered literals.
strictfp
was introduced with
[JG/BJ/GS/GB,2000],
assert enum
were introduced with
[JG/BJ/GS/GB,2005].
[
Java programming language specifications
/
abstract
/ final
/ private
/ protected
/ public
/ static
classes
/
interface
(to separate declaration and definition)
/
synchronized
features
/
use of volatile
keyword
]
-
What are the differences between Java and C++?
Java versus C++; Java vs. C++?
-
Java usually shows better maintainability
(fewer pitfalls and a less complicated language).
A listing of [technical] language topics can be found
here:
javacppdiffs.html.
-
Why is the const keyword not supported?
-
Assignment finality / constantness
(C++:
T* const t =
...)
is supported
by the final
keyword
(Java due to its flow analysis
also supports blank final fields and variables).
Constantness / immutability
of referenced objects / data
(C++: T const* t;
)
isn't explicitly supported;
it is usually contractual
(e.g. String
),
sometimes found in the type's name
(e.g. BigInteger
vs. MutableBigInteger
).
'Constness' is considered
rather logical than physical
with more abstract (object oriented) programming languages
(C++ as opposed to C);
C++ needs the additional mutable
keyword
in such contexts.
Physical constantness
(constant, usually non-OO, data put into read-only sections)
isn't used with Java's heap object data.
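A minimal Java sketch (hypothetical names):
static void sample() {
    final StringBuilder sb = new StringBuilder(); // assignment finality: cf. C++ 'T* const t'
    sb.append("x");                               // the referenced object itself remains mutable
    // sb = new StringBuilder();                  // would not compile: final reference
    final String s = "abc";                       // immutability of the data is contractual (String)
}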
-
Why is goto not supported?
-
Java imposes more restrictions
on variable initialization
(which helps making code more robust);
the need for code flow analysis
(to check initialization or single initialization
[of blank final variables])
isn't compatible with [general]
goto
statements
(i.e. some gotos wouldn't be valid
or would make the code flow check infeasible).
However, labeled continues and breaks are permitted
(since no valid initializations are skipped
when e.g. jumping out of a for loop).
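A minimal sketch of a labeled break (hypothetical helper; replacing what would be a goto out of nested loops in C):
static int[] find(int[][] matrix, int target) {
    int[] found = null;
    outer:
    for (int i = 0; i < matrix.length; i++) {
        for (int j = 0; j < matrix[i].length; j++) {
            if (matrix[i][j] == target) {
                found = new int[] { i, j };
                break outer; // leaves both loops at once; no variable initializations are skipped
            }
        }
    }
    return found;
}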
-
Why is multiple inheritance not supported in Java?
-
Not supporting multiple class inheritance makes the programming
language easier to use and implement, fewer pitfalls
occur (e.g. with diamond-shaped inheritance and its precedence problem),
fewer obscure features are present (e.g. [private]
class-mix-in).
The more important multiple interface ('data-less classes')
implementation is however supported in Java.
[
Java's design enhancements
/
about C++'s virtual multiple inheritance complexity
/
question on C++'s 'virtual' modifier
/
multilevel inheritance
]
-
What about multilevel inheritance (vs. multiple inheritance)?
Multilayer inheritance?
-
Deep class hierarchies are not necessarily cool (either).
[Class hierarchy layouts are subject to discussion,
even more so with increasing deepness.]
Multilevel hierarchies are often used to slice responsibilities
(in order to encapsulate/modularize).
[
multiple inheritance
]
-
How to separate function declaration from definition in Java?
-
The typical way to separate
interface declaration from definition (implementation)
in Java
is to define an
interface
declaring the methods (functions) in question
and provide (separate) implementation(s) for it.
The implementations can be instantiated
via
reflection
((interface)Class.forName(String).newInstance()
)
to totally decouple code
(at compile-time);
often this is accomplished [indirectly] via
a JNDI (service locator) framework.
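A minimal sketch (hypothetical names; each type in its own source file):
// Greeter.java: the declaration
public interface Greeter {
    String greet(String name);
}
// PlainGreeter.java: a separately compiled/deployed implementation
public class PlainGreeter implements Greeter {
    public String greet(String name) { return "hello " + name; }
}
// in some client class: instantiation without a compile-time dependency on the implementation
static Greeter load(String className) throws Exception {
    return (Greeter)Class.forName(className).newInstance();
}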
-
Unsigned numbers in Java?
-
Often you don't need them
(in any language);
see
[C] coding guidelines.
[One additional bit of 32 e.g. isn't much;
type-safeness
isn't gained either:
overflows of primitive types are silently ignored.]
However: mind masking out the sign-extended bits
after widening a semantical unsigned number
[e.g.:
int unsignedValue; ...;
long value = unsignedValue & 0xffffffffl;
(similar masking is used in
BigRational code)].
-
Java byteorder?
-
The [underlying]
byteorder
is not visible in Java;
it's hidden by the VM.
You
wouldn't need it
anyhow.
-
What kinds of classes does Java support?
-
A Java class is either
a top-level class,
a member class or
a local class.
The latter two are called
nested classes;
member classes are either
inner classes
(with a reference to the enclosing object) or
static classes.
Classes can be
abstract,
final or
neither.
They can be
super (base) classes or
sub (derived) classes,
[only explicitly] neither, or
both.
Classes can have
public or
package access;
member classes can additionally have
protected or
private access.
Anonymous classes are
local, inner and [implicitly] final.
A non-final class can be used
[externally, if allowed by access modifiers]
as a super class;
classes are implicitly
[direct or indirect] sub classes of
Object
(i.e. Java's type hierarchy is single-rooted);
Object
is known to be a super class
(e.g. of classes defined in the java.lang
package
[and of course all additional classes]).
Nested classes were
introduced
in
[JG/BJ/GS/GB,2000].
-
What are Java's access modifiers?
-
private
,
default/package/friendly,
protected
,
and public
;
each being a subset of the following,
i.e. the access modifiers are not
orthogonal
(protected access includes package access).
Private fields and methods are only accessible by the implementing class itself,
default/package/friendly (no keyword) by all code in the package,
protected additionally by derived classes' code,
and public generally.
Having protected
implying default
makes the language simpler.
In a good object oriented approach
private
would be all data fields
and helper methods,
protected
only helper constructors
and some helper methods needed by derived classes,
and public
the API
(that possibly implements
a public interface
).
-
Why does Java not need header files?
-
Unlike
C and C++
with their libraries (and archives)
imposing least overhead [in size],
Java
includes the signatures,
return types and access modifiers
in the [compiled] byte code
(.class files and .jar files).
The information is also needed
for version and security checks at runtime.
[
how to separate declaration from definition
]
-
What does the synchronized keyword provide, what not?
What is thread starvation?
-
Java's
synchronized
keyword
does provide
guarding of critical sections and
memory synchronization.
It [usually]
doesn't trigger
task switches,
which may lead to thread starvation:
one/some thread(s) don't get to run
[see
synchronized block based code sample
and
java.util.concurrent.locks based code sample
(both implementing [default] non-fair and fair monitor/lock use,
the first by explicit code)];
a higher thread priority doesn't help,
nor does [Thread.]yield
;
in a robust approach
only a kind of wait/notify paradigm
or
Doug Lea's
fair locks
[now integrated into Java 1.5 [API],
as java.util.concurrent.locks]
will work;
these are however collaborative constructs
(and constitute some {possibly unwanted} level of dependency).
A JVM's monitors could be implemented for fair scheduling,
but [typically] aren't,
for performance reasons.
Note that a multi-processor system may defuse
thread starvation problems a bit
[buy-decent-hardware [for [Java] servers] pattern].
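A minimal sketch [assuming Java 1.5+; not one of the linked code samples]
of a fair java.util.concurrent.locks lock,
granting the lock to the longest-waiting thread and thereby countering starvation
[at some throughput cost]:
import java.util.concurrent.locks.ReentrantLock;
class FairCounter {
  private final ReentrantLock lock = new ReentrantLock(true); /* true: fair ordering */
  private long count;
  void increment() {
    lock.lock();
    try { count++; } /* critical section */
    finally { lock.unlock(); } /* release also on exceptions */
  }
  long get() {
    lock.lock();
    try { return count; }
    finally { lock.unlock(); }
  }
}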
[
flush by synchronized
/
double check locking pattern
/
reentrancy
/
reentrancy problems on multi-processor systems
/
thread starvation sample
/
thread deadlock sample
]
-
Is specific synchronization on an object needed to flush out its VM-local working memory copy?
-
No,
any synchronization should do.
Often objects contain dedicated sync-objects
for different synchronization/notification purposes,
which are supposed to synchronize the parent [referrer] object's memory too.
[
memory synchronization by synchronized
]
-
Shall I use Java's volatile keyword?
-
If you don't know: generally: no; rather use explicit
synchronization
at the appropriate places
[possibly using dedicated synchronization objects,
to avoid deadlocks,
and don't lock too long ;)].
A few cases where volatile fields may work are:
if updates are idempotent (e.g. on a hash value [cache] field,
such as
String
's, where however not even volatile
matters),
or if there's one writer thread only,
updating one variable (that actually is updated atomically) only.
Use of
volatile
earlier may have indicated a degree of cluelessness.
Some of its uses seemed to be broken
just like the
double check pattern.
Further it introduces dependencies
[and therefore increases complexity, and decreases robustness;
e.g. HashMap
's
transient volatile int modCount;
and its unguarded updates modCount++;
that are likely not atomic
(getfield modCount:I;
iconst_1;
iadd;
putfield modCount:I;
)
assume that it doesn't matter to lose
one [concurrent] update,
as long as not all relevant updates are lost
(that aren't, since modCount is only counted up,
and only one bit of information {'changed'} is used)].
Consider also using types of the
(also non-blocking) AtomicReference
family.
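A minimal sketch of such an atomic alternative
[java.util.concurrent.atomic, Java 1.5+;
hypothetical field and methods, not HashMap's actual code]:
import java.util.concurrent.atomic.AtomicInteger;
class ModCounted {
  private final AtomicInteger modCount = new AtomicInteger(0);
  void structurallyModify() {
    modCount.incrementAndGet(); /* atomic read-modify-write: no [concurrent] update is lost */
  }
  int snapshot() {
    return modCount.get(); /* with volatile-read [memory visibility] semantics */
  }
}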
[
synchronized keyword
/
memory synchronization object
/
code robustness
]
-
What are the differences between enumeration and iterator?
-
Both
of the two Java interfaces
implement the
iterator
pattern
[EG/RH/RJ/JV,1994],
more specifically that of an external iterator
[external: the data structure advancing is controlled by the client,
which offers more flexibility].
A
java.util.Iterator
is
a semi semi robust iterator;
it additionally includes
a remove
method and
[better] takes care of the underlying data structure's
modification status
[one 'semi' here arises from the fact that
only remove functions
are considered but not insert functions,
the other from the fact that such operations are only allowed via
the iterator's methods
but not via the normal container methods;
asynchronous {multithreading} issues are
not considered {at all},
since they question the validity of returned elements
which are being processed;
omitting insertions makes
the iterator implementation simpler and
leaner
{and more
robust
in a different sense}
and avoids questions about iterator-after-insertion semantics;
removal during iteration is frequent
but insertion rare,
samples being
elements deregistering themselves].
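A minimal sketch of removal via the iterator's method
[pre-generics style, matching the Java 1.4 era discussed here]:
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
class RemoveViaIterator {
  static void removeDrops(List list) {
    for (Iterator it = list.iterator(); it.hasNext(); ) {
      Object o = it.next();
      if ("drop".equals(o)) {
        it.remove(); /* allowed; list.remove(o) here would risk a ConcurrentModificationException on the next next() */
      }
    }
  }
}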
[
race conditions with iterators
]
-
Can race conditions arise from concurrent inserts and removes on collections?
-
Yes.
E.g. Java 1.4's
AbstractCollection
implements
an unsynchronized method remove(Object)
with the help of an Iterator
,
which may throw a ConcurrentModificationException
,
e.g. in the case of the concrete class ArrayList
(which indirectly inherits from AbstractCollection
,
and whose Javadoc informs on this behavior).
Safer is the similar Vector
,
which implements a synchronized remove(Object)
.
Another alternative is the use of a wrapper collection
returned by Collections.synchronizedCollection(...)
.
Yet better is the use of a more appropriate data structure,
e.g. a Hashtable
when remove(Object)
operations are frequent
[Hashtable
is synchronized
and doesn't iterate].
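A minimal sketch of the wrapper alternative mentioned above
[note that iteration over such a wrapper must still be guarded by the client,
by synchronizing on the wrapper object, as its Javadoc demands]:
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;
class SynchronizedWrapperUse {
  static void demo() {
    Collection c = Collections.synchronizedCollection(new ArrayList());
    c.add("x"); /* single operations (add, remove, size, ...) are synchronized by the wrapper */
    c.remove("x");
    synchronized (c) { /* iteration is not; guard it manually on the wrapper object */
      for (Iterator it = c.iterator(); it.hasNext(); ) {
        it.next();
      }
    }
  }
}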
[
read manual pages to obtain robust code
]
-
Sequence points in Java?
-
Java doesn't have the concept of sequence points;
the order of evaluation of subexpressions, arguments of method
calls, etc, is strictly defined.
This leaves fewer pitfalls for the programmer,
but also less room for expression optimization by the compiler.
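A small illustration of the strictly defined [left-to-right] evaluation order:
int i = 1;
int j = i++ + i; /* left operand first: yields 1, i becomes 2; then right operand: 2; so j == 3 on every conforming JVM */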
[
sequence points in C
]
-
How would you implement multiple return values in Java?
-
As Java doesn't provide reference parameters,
multi-value returns can be implemented
by returning either an object of a helper class
[some overhead, an additional type],
or an array of
Objects
[not compile-time type-safe, primitive types have to be boxed]
or by using (additional) holder type parameters
[as e.g. in CORBA implementations].
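A minimal sketch of the helper-class variant
[hypothetical DivMod type, returning quotient and remainder at once]:
class DivMod { /* helper class holding both results */
  final int quotient;
  final int remainder;
  DivMod(int quotient, int remainder) {
    this.quotient = quotient;
    this.remainder = remainder;
  }
  static DivMod of(int a, int b) { /* 'returns' two values */
    return new DivMod(a / b, a % b);
  }
}
/* usage: DivMod dm = DivMod.of(17, 5); dm.quotient == 3, dm.remainder == 2 */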
[
multiple return statements
/
porting C++ code to Java
]
-
Should I use multiple return statements in a function?
-
Yes,
it makes function code less nested.
The use of exceptions
defeats
the single-exit-point paradigm anyhow.
[
return multiple values in Java
/
(avoid) block nesting (by early returns)
]
-
What is the Java Turtle?
Where can I find a Java Turtle implementation?
How to implement a turtle geometry?
-
The
Java Graphics Turtle
[known too from the
Logo
programming language]
establishes an
alternative point-of-view on low-level graphics programming.
A Turtle's graphics-state includes a position
(usually two-dimensional),
a heading (direction)
and such things as pen-up/down state, color, etc.
A Turtle's primitives [idioms] are directives such as
forward, left
, etc.
Samples of these
Java turtle graphics
can be found
here.
Insights into turtle geometry implementation may be gained from
com/lrdev/turtle/TurtleCore.java
[part of the
com.lrdev.turtle package;
includes wrapping, clipping and scaling].
-
What are the Turtle keywords?
Turtle commands?
Turtle primitives?
-
back drawstate forward heading home left nowrap pendown penup right
setheading setx setxy sety towards wrap xcor ycor
/
bk fd lt pd pu rt seth
-
How to draw a fractal tree with a Java Turtle?
-
See
simplefractaltree.java
for a simple Java Swing based solution
(core being the recursive drawing method
void draw(Turtle turtle, double distance, double distanceFactor, double omega, int depth) {
if (depth > 0) { /* recursion-end check */
turtle.forward(distance); /* trunk */
turtle.left(omega); /* orient towards left branch */
draw(turtle, distance * distanceFactor, distanceFactor, omega, depth - 1);
turtle.right(2 * omega); /* orient towards right branch */
draw(turtle, distance * distanceFactor, distanceFactor, omega, depth - 1);
turtle.left(omega); /* back to origin */
turtle.back(distance);
}
}
),
and
fractaltree.java
(and
its
jar)
for a more complex solution
including GUI elements (sliders) to adjust parameters at runtime,
as well as coloring and different strokes (using Graphics2D).
[Java Swing types/components used:
JFrame (the application window),
JPanel (the paint window, and the sliders panel),
JSlider (parameter control),
JSplitPane,
BorderFactory/TitledBorder (parameter controls naming / value display),
ChangeListener (parameter change propagation, dependencies considerations (enabling/disabling additional controls));
Graphics2D types used:
Graphics2D,
BasicStroke,
Color,
Rectangle;]
[
recursion
/
hilbert curve
]
-
What is the Hilbert Curve?
-
The Hilbert Curve is a fractal pattern
[a
recursive
pattern,
that has an order {depth};
it is e.g. described in the header comment of
hilbert.java
or
here].
It is a good candidate for a simple sample of an animated
applet
[see
Java page].
hilbert.java uses a
Java Turtle
too,
but a much simpler one
[that e.g. can only change directions in ninety degree steps].
[
recursion
/
Moore curve
(a curve similar to the Hilbert curve,
but closed)
]
-
Where can I find the BigRational Java class?
BigRational: what is it?
-
BigRational.java
can be found
here (javadoc)
[and
here (source)].
The big rational number type lets you do the arithmetic
operations addition, subtraction, multiplication and division
without overflow and without loss of precision.
[The data type typically grows {concerning memory use and calculation
overhead} with its size and precision.]
Note that logarithms, roots, etc cannot be handled precisely with a big
rational number type, since their results typically don't fall into the
rational number domain.
[
about the BigRational testcases
/
about BCDs
/
question on integer overflows
/
rounding
/
add rational numbers
/
sample/simple big rational calculator
]
-
How to add two rational numbers in Java?
-
With
BigRational
s
you would either say
c = a.add(b);
or c = b.add(a);
(addition is commutative [and
associative]).
Java
doesn't support operator overloading,
so expressions such as
c = a + b;
won't be accepted for other than primitive types
[with an exception of String
,
so you could say
c = "+3/5" + "+7/11";
but that doesn't do the actual addition,
and relies on using an explicit sign
(i.e. a denormalized representation)].
Have a look at
BigRational.add(BigRational)
on how it's implemented by means of [big] integers
(representing numerator and denominator):
as an optimization, to avoid expensive operations
(gcd, multiplication) and object creation overhead,
the cases of one or both operands being zero are treated specially,
same denominator is treated specially,
one or both operands being an integer is treated specially;
all avoiding additional big integer multiplication(s).
The general solution is then
new BigRational(m_n.multiply(that.m_q).add(that.m_n.multiply(m_q)),
m_q.multiply(that.m_q));
(BigRational(BigInteger,BigInteger)
then normalizes numerator and denominator
to have
no common factors
[by means of
gcd]).
[
Java BigRational
/
add associativity
]
-
Can I use BigRational to convert binary64 to binary32?
-
As
BigRational
can read in all of binary16, binary32, binary64, and binary128,
and output to any of these types too,
BigRational.valueOf(double).floatValue()
can also be used
to convert binary64 (double) to binary32 (float);
however, a [[Java] language provided] simple cast
(float)double
has the same effect,
with less overhead
[memory-wise and computationally;
being in essence no more than stripping excess bits
and a simple rounding operation
(provided the exponent fits into the smaller size)].
-
How to convert big {integer|rational|decimal} to {double|int} without loss of precision?
-
With a conversion to a
double
[floating point number]:
if the big number contains more than 53 significant bits
(if representable as an IEEE 754 normal number,
else even less),
or represents a rational with a denominator other than a power of two,
or carries a binary exponent of more than 1023:
you can't.
With a conversion to an int
[integer number]:
if the big number contains more than 31 significant bits,
or represents a rational:
you can't.
Easiest test
for precision loss
with
BigRational
to double
is:
NaNConsideringBigRational.valueOf(x.doubleValue()).equals(x)
(NaNConsideringBigRational.valueOf(double)
doesn't throw on Double.POSITIVE_INFINITY
,
unlike BigRational
).
[
rounding
]
-
Why does the big {integer|rational|decimal} number type not implement the concept of "Infinity"?
-
IEEE 754 floating point numbers
(the ones that are typically used,
on prevalent computer systems)
have a fixed size exponent representation
(e.g. 11 bits in the case of a double/binary64),
making overflow possible
and predictable
(e.g. 1.0E200*1.0E200 resulting in "Infinity").
(The overflows (and NaNs) are propagated
in [composed] floating point calculation expressions,
to be optionally checked later.)
Such overflows do not directly occur
with big numbers
[but number size is typically limited by available memory
(depending on the -Xmx option)
(and the range of exponent/bits size representable
by the typically used 31 bits:
big number size is typically not a big number itself)],
and are not taken care of.
It's however not difficult to decorate such types
to catch a predefined 'overflow condition'
(e.g. applying a memory consumption threshold).
[
NaNConsideringBigRational
(implementing the notion of NaN, similar to that of Infinity)
]
-
Do you cover J2ME / MIDP / CLDC questions?
-
No.
[
J2ME portability
]
assembly code
-
Do you feature SPARC assembly samples?
SPARC sample code?
-
See
a sample of analyzing SPARC assembly code
for a small sample of SPARC assembly code.
That paper of course has a focus on analyzing such code.
-
What's the SPARC instruction set?
SPARC assembly opcodes?
SPARC assembly instructions?
-
The SPARC instruction set is defined in the according
architecture manuals, e.g. for SPARC version 8:
[The SPARC Architecture Manual,
Version 8,
David L. Weaver, Tom Germond,
SPARC International,
Prentice Hall,
1992,
ISBN 0-13-825001-4].
An AnswerBook2: SPARC Assembly Language Reference Manual may be
found online.
SPARC International
offered some
documents.
Processor specific definitions were found e.g. at
SUN,
e.g. for an
UltraSPARC-IIi processor.
sparcv8 code runs on e.g. older SUNs too;
sparcv9 code typically requires e.g. a SUN Ultra
[the way to address this issue better is to use platform specific
[dynamic] libraries].
-
What are SPARC synthetic instructions?
-
Some seemingly fundamental SPARC CPU operations are implemented by means
of other opcodes, often using the zero/sink register
%g0
.
These are called synthetic instructions;
the list is the following
[including the opcode(s) they are translated to]:
bclr [andn],
bset [or],
btog [xor],
btst [andcc],
call [jmpl,call],
clr [or,st],
clrb [stb],
clrh [sth],
cmp [subcc],
dec [sub],
deccc [subcc],
inc [add],
inccc [addcc],
jmp [jmpl],
mov [or,rd,wr],
neg [sub],
not [xnor],
restore [restore],
ret [jmpl],
retl [jmpl],
save [save],
set [sethi;or],
tst [orcc].
See too the sources mentioned in the
above question.
-
How to call printf from SPARC assembly?
-
First:
you don't want to do that:
usually,
assembly code is called by high-level-language code,
not the other way around;
assembly code is used for highly platform-adapted functions,
the adaptations are e.g. done for performance reasons,
and in that context it doesn't make sense to call
less optimized code
[such as
printf
].
Second:
see e.g.
here
[sample compiled from high level language]:
...; .s:; .ascii "hello, world\n\000"; ...;
sethi %hi(.s),%o0;
add %o0,%lo(.s),%o0;
call printf;
nop; ...;
Third:
the operations involved in calling a library function
[such as printf
]
are very specific not only to the platform [CPU],
but also to the chosen high level language and the compiler settings
(see e.g. __fastcall
pragma,
pascal stack cleanup mode,
volatile registers across function calls,
or
polymorphic C++ method calls).
[
convert [integer] to string [via s[n]printf]
]
-
Do you feature x86 assembly samples?
-
No.
But there's a
dump
of a (highly optimized) small x86 application, with
analytical comments.
x86 patterns such as shorter instruction encodings and pseudo-three-operand
opcodes are discussed.
[
sample bitreverse.c x86 assembly
/
C++ relevant assembly dump snippets
/
non-recursive gcd assembly implementation
]
-
Which x86 CPU registers are typically considered non-volatile across C function calls?
-
That depends on compiler settings, compiler and (for external calls)
library policies.
Typically
esp
and ebp
[makes things less tricky if they are non-volatile],
as well as
esi, edi
and
ebx
are considered non-volatile
[i.e. more x86 registers are considered non-volatile than volatile].
See e.g.
x86 sample dump with comments.
-
What's the sense of xor eax,eax?
xor esi,esi?
-
xor eax,eax
clears (zeroes out) the register;
its x86 [ia32] CISC opcode
(33 C0,
or 31 C0) uses 2 bytes,
whereas mov eax,0
(B8 00 00 00 00)
uses 5 bytes;
so the former is at least more space efficient.
The obfuscation property of such opcodes shouldn't hurt,
as [[optimizing] compiler generated] assembly code was
never intended to be [easily] readable.
-
What is __set_app_type?
_except_handler3?
-
__set_app_type, _except_handler3,
etc,
as e.g. present as references in this
win32 x86 dump,
can be found in the
system specific C runtime libraries and sources
(sources are often available as part of the development system).
These functions are part of the C runtime code
and are implementation specific internals
(hence e.g. beginning with __
or _
[reserved identifiers
in
C]).
[The 'application type' that is set in the function in question
(e.g. console or windowing system)
is e.g. used to route diagnostic output.]
-
What is bcd?
-
Binary coded decimals (bcd) is a number format that
encodes decimal digits (0 to 9; 3.3220 bits of information) in 4 bits
(one nibble, half a byte) each.
Pocket calculators use[d] to calculate in (such) a
decimal [digit-wise] format to achieve a fast display stage [on {ancient}
slow processors!] (to give the calculator a better interactive response,
e.g. while rolling registers) and allow things such as the
hours-minutes-seconds (hms) format or the start-end-step
increment/decrement format, which would not be feasible with binary
floating point formats.
Decimal formats avoid the kind of
representation and rounding errors that are unexpected
by typical users [who tend to think decimally {only}].
The
x86
opcode
daa
(decimal adjust after addition) implements
two-step bcd operations (e.g. adc/daa
pairs) and
typically does two digits (one byte only) per round, since one
carry flag is required per digit [x86 implements 'the' carry
flag and an additional 'auxiliary' {'nibble'} carry flag].
[x86 processors also support an unpacked bcd format
(with which the more significant nibble is ignored on inputs, and so can
be fed e.g. with ASCII digits {\x30..\x39} directly).]
Precise base-10 representations can be implemented too by
scaled big-numbers (big-decimals) and big-rationals.
[
Java BigRational
]
-
What is 7efefeff?
0x7efefeff?
81010101?
0x81010101?
-
7efefeff is a bit pattern used in
optimized strlen implementations.
0x7efefeff is actually -0x81010101
[in two's complement representation], that, on addition to a
four-byte (32 bit) value, triggers carry-outs when encountering
nul bytes
[more explanations see
source code comments
of
strlen.c].
These carries are used for a faster check of string termination
in chunks of 4 bytes (on 32 bit systems).
A similar algorithm can be used in memchr implementations, by
prior xor-ing the chunks with a pattern of the repeated
search-character.
Other patterns are
81010100 (0x81010100)
[and on 64 bit systems
7efefefefefefeff (0x7efefefefefefeff),
8101010101010101 (0x8101010101010101),
8101010101010100 (0x8101010101010100)].
7efefeff and other values mentioned are often found in
register dumps of crashed processes [or even in
operating system panic dumps {and panic messages}]:
if illegal string pointers are given as arguments to
strlen
or similar functions (strchr, memchr,
etc) [usually due to
a software bug], a process will dump on the first memory load,
probably after initializing registers with the above values;
the values may also be remnants of earlier function calls [such
functions are called frequently and unused registers do
not get cleared].
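The core of the 0x7efefeff carry check, transcribed to a Java [32 bit int] sketch
[the originals live in optimized C/assembly strlen implementations;
a word flagged here still needs a byte-wise recheck, since the test may yield false positives]:
static boolean mayContainNulByte(int word) { /* word: four packed bytes */
  final int MAGIC = 0x7efefeff; /* == -0x81010101 in two's complement */
  /* adding MAGIC produces carry-outs across the byte positions; the xor/and isolates the tell-tale bits: */
  return (((word + MAGIC) ^ ~word) & ~MAGIC) != 0;
}
/* e.g. mayContainNulByte(0x61626364) == false, mayContainNulByte(0x61620064) == true */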
[
more comments on an optimized C strlen C implementation
/
ffff0000 patterns
]
-
What is ffff0000?
aaaaaaaa?
-
The set of patterns
ffff0000 (0xffff0000), 0000ffff (ffff, 0x0000ffff, 0xffff),
ff00ff00 (0xff00ff00), 00ff00ff (ff00ff, 0x00ff00ff, 0xff00ff),
f0f0f0f0 (0xf0f0f0f0), 0f0f0f0f (f0f0f0f, 0x0f0f0f0f, 0xf0f0f0f),
cccccccc (0xcccccccc), 33333333 (0x33333333),
aaaaaaaa (0xaaaaaaaa) and 55555555 (0x55555555)
consist of bit groups
10101010..., 11001100..., 11110000..., ...,
and are
used in
binary integer logarithm calculation,
bit reversing
and
bit population count
[by recursive pairwise addition];
all have O(lb(bitsize)) runtime.
[Hexadecimal] patterns of repeating 3 or c [i.e.
#333333, #cccccc] are also found in numeric representations of colors
[more precisely: of different gray levels] belonging to a
uniform 8-bit depth color palette
[see e.g.
'the browser safe palette'
{instructions on how to choose an image color palette to best fit a known
application's color palette, to suppress disturbing dithering
effects}].
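A Java sketch of the bit population count built from exactly these masks
[recursive pairwise addition of 1-, 2-, 4-, 8- and 16-bit groups; 5 == lb(32) steps]:
static int popCount(int x) {
  x = (x & 0x55555555) + ((x >>> 1) & 0x55555555); /* add neighboring 1-bit groups */
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333); /* add neighboring 2-bit groups */
  x = (x & 0x0f0f0f0f) + ((x >>> 4) & 0x0f0f0f0f); /* 4-bit groups */
  x = (x & 0x00ff00ff) + ((x >>> 8) & 0x00ff00ff); /* 8-bit groups */
  x = (x & 0x0000ffff) + ((x >>> 16) & 0x0000ffff); /* 16-bit groups */
  return x; /* same result as Integer.bitCount(x) */
}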
[
patterns used in binary logarithm calculations
/
7efe... pattern
/
recursion
]
-
What's the meaning of bitscan?
-
The term bitscan usually describes the operation of finding the
first or the last occupied bit in an integer / register.
Bitscan can be found as opcode or implemented as function
[see question on
binary logarithm].
patterns
-
What does "pattern" mean?
-
In software design,
which is what this is about,
pattern usually refers to what was popularized by
[Design Patterns:
Elements of Reusable Object-Oriented Software,
Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides,
Addison-Wesley,
1994,
ISBN 0-201-63361-2].
Doug Lea
[Concurrent Programming in Java:
Design Principles and Patterns,
Doug Lea,
Addison-Wesley,
2nd Ed. 2000,
ISBN 0-201-31009-0]
formulates the term pattern this way:
a pattern encapsulates a successful and common design form,
usually an object structure
(also known as a micro-architecture)
consisting of one or more interfaces, classes, and/or objects
that obey certain static and dynamic constraints and relationships;
patterns are an ideal vehicle
for characterizing designs and techniques
that need not be implemented in exactly the same way
across different contexts,
and thus cannot be usefully encapsulated as reusable components.
[
pattern catalogs
/
bit patterns:
bit patterns used in binary integer logarithm calculations
[e.g. 0xf0f0f0f0u],
bit patterns used in reversing bits
[same as for binary logarithm],
bit patterns used in optimized strlen implementations
[e.g. 0x7efefeffu],
/
fractal graphics patterns:
Hilbert Curve
]
-
What is the double check pattern?
What is double check locking?
Is double check locking broken?
-
Don't use double check locking.
The
double-check-pattern
assumes that [simple] read operations are system-wide atomic [and memory
is sufficiently synchronized] and applies a first unguarded
[unmonitored] data lookup as a speed optimization.
The double check pattern is mainly used in data initialization [e.g.
with singletons; done once only, hence a [potential] big lock overhead].
Instead of double check locking,
find faster lock implementations.
[[DL,2000]
writes about Java locking performance
[about]:
the overheads associated with concurrency constructs
decrease as
JVMs
improve;
the overhead cost of
a single uncontended
synchronized
method call
with a no-op body
is on the order of
a few unsynchronized no-op calls.]
In Java, singleton instantiation can be done in
static initializers [and class variables],
which are guarded against concurrency problems
by the class loader
(i.e. an external monitor is applied).
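A minimal sketch of this class-loader guarded alternative
[lazy variant via a nested holder class; hypothetical Singleton name]:
class Singleton {
  private Singleton() { } /* no public construction */
  private static class Holder { /* loaded, and therefore initialized, on first use only */
    static final Singleton INSTANCE = new Singleton();
  }
  static Singleton getInstance() {
    return Holder.INSTANCE; /* class initialization is guarded by the class loader / JVM; no explicit locking needed */
  }
}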
[
a warning: DoubleCheckedLockingIsBroken
/
Java's volatile keyword is broken
]
-
What is the difference between recursive locking and double check locking?
-
A recursive lock may internally use two locks
[two lock data structures],
whereas
double check locking
looks up [lock-] data twice.
The recursive lock [reentrant lock]
by definition doesn't block (deadlock) if
acquired more than once by the same thread.
Recursive locks can be implemented by means of [typically two]
non-recursive locks.
[Recursive locks are default with the
synchronized
modifier in Java, i.e. one synchronized method
can safely call another synchronized method of the same class.]
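A minimal Java sketch of that reentrancy
[one synchronized method safely calling another one on the same object]:
class Counter {
  private int value;
  synchronized void add(int n) {
    value += n;
  }
  synchronized void addTwice(int n) {
    add(n); /* re-enters the monitor already held by the current thread: no self-deadlock */
    add(n);
  }
}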
[
recursive lock C++ sample implementation
]
-
What is the 'let the user decide' pattern?
-
The
let-the-user-decide-pattern
is a user interface pattern whose intent it is to make
programs [or libraries, etc] more useful by allowing
configurations that don't make [optimal] sense at the first sight.
The typical approach is to not [yet] apply constraints that
seem straightforward.
-
What is the 'sort up to a unique key' pattern?
-
When sorting data,
don't generate non-deterministic data output [order],
especially not if the recipient is an [end] user;
it is disturbing if an order cannot be deduced
or (worse) if it [slightly] changes from instance to instance.
Don't rely on data fields being 'sufficiently' unique,
unless they are specified [and enforced] to be unique
[or at least try your best].
Don't rely on an initial order being meaningful.
Don't rely on initial sub-orders being preserved
[unstable/non-preserving sort algorithms such as
qsort
may/will change initial sub-orders].
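A minimal Java sketch
[hypothetical Person type with a unique id as the final tie breaker; Java 5+ generics]:
import java.util.Comparator;
class Person {
  final String lastName; /* possibly non-unique primary sort key */
  final int id; /* specified [and enforced] to be unique */
  Person(String lastName, int id) { this.lastName = lastName; this.id = id; }
}
class PersonOrder implements Comparator<Person> {
  public int compare(Person p, Person q) {
    int c = p.lastName.compareTo(q.lastName);
    if (c != 0) { return c; }
    return (p.id < q.id ? -1 : (p.id > q.id ? 1 : 0)); /* unique tie breaker: output order is deterministic */
  }
}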
-
Do you maintain a pattern catalog?
-
No.
You may find one by following the links below.
A good source too is
[The Pattern Almanac 2000,
Linda Rising,
Addison-Wesley,
2000,
ISBN 0-201-61567-3].
[
pattern wiki
/
Linda Rising's site
]
-
Which patterns did you formulate?
-
I formulated
such things as
the
'buy-decent-hardware' pattern,
'let the user decide' pattern,
the
'sort up to a unique key' pattern
and
the
'use two indices' parse pattern.
There are lots of implicit patterns,
some of which can be found in
these publications
and
[here],
e.g.
analyze-running-system [don't restart and lose state; use truss, pstack, etc],
dont-use-switch,
find-and-use-abstractions [find and use the right abstractions, be sufficiently abstract],
know-your-language,
read-specifications [be informed, be educated],
test-on-multiprocessor-systems [test threading issues on multiprocessor systems {too}],
use-normalization [normalize file contents before saving],
use-small-scope,
etc.
-
How do I implement an 'undo' operation?
-
Use the Command pattern
[EG/RH/RJ/JV,1994].
Java's
StringBufferStream
implements an undo
mechanism
[mark
;
as a stream extension;
see
javacppdiffs].
-
Is there a sample of a ring buffer?
-
sqfig.zip:sqenc.c:mergeLineFilter
(C sample)
implements a ring buffer
[a.k.a. circular buffer; this sample is used in a parser too].
The ring buffer features a begin pointer (current read
position) and an end pointer (current write position), which
are updated in the respective operations, while considering buffer wraps
and increasing buffer size if necessary.
-
Concurrent circular buffer?
Concurrent fifo queue?
-
More abstractly, a [concurrent] ring buffer (circular buffer)
is a [possibly bounded (having restricted maximum size)]
fifo (first-element-in-is-first-element-out) queue.
Java [1.5 and up; see
Doug Lea]
implements [
java.util.concurrent.
]
ArrayBlockingQueue
(space optimized, bounded)
and LinkedBlockingQueue
(time and concurrency optimized,
optionally bounded [implementing "a variant of the 'two lock queue'
algorithm: one lock gating entry to put/offer operations, the other
to take/poll operations; a count field maintained as an atomic
avoids the need to get both locks with put/offer/take/poll"]).
[Especially concurrent] queues
are used
for producer/consumer patterns
(to separate code [flows]
and make things asynchronous
[e.g. to avoid deadlocks
or make things more responsive])
and task [queue] / worker thread pool patterns
(e.g. to decouple network reads/writes
[and flow control] from processing
and to limit thread resources
[to make things more scalable]).
Queues are chosen to be bounded
to limit [memory] resource requirements
(to avoid out-of-memory scenarios
and system thrashing)
and/or implement flow control
(e.g. to make things fairer
and avoid worst-case response scenarios).
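A minimal producer/consumer sketch
[Java 1.5+, bounded ArrayBlockingQueue;
put blocks while the queue is full, take blocks while it is empty,
which gives the flow control mentioned above]:
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
class ProducerConsumer {
  public static void main(String[] args) throws InterruptedException {
    final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(16); /* bounded: flow control, limited memory */
    Thread consumer = new Thread(new Runnable() {
      public void run() {
        try {
          for (;;) {
            String task = queue.take(); /* blocks while empty */
            if ("EOF".equals(task)) { return; } /* 'poison pill' to shut down */
            System.out.println("handling " + task);
          }
        } catch (InterruptedException e) { }
      }
    });
    consumer.start();
    for (int i = 0; i < 100; i++) {
      queue.put("task-" + i); /* blocks while full */
    }
    queue.put("EOF");
    consumer.join();
  }
}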
-
What is file normalization?
What is document normalization?
-
Documents are often laid out ignoring stuff such as
spacing [which includes line breaks {line wraps}] in their
input files.
Normalized [document] sources allow better comparisons
[including
diff
and version control tools] and are
sometimes more [human] readable [the lack of normalization is unfortunately e.g. seen in RTF
document format usages].
[
RTF file normalizator
/
diff output readability
/
normalized sorted data (sort-up-to-a-unique-key pattern)
]
-
How to make diff output more readable?
-
Key patterns for making
diff
output more
readable are
normalization
(either implicit {such as with the -b
option}
or by filter {such as
rtfn}),
context (e.g. by -u
option)
and [possibly] outlining.
Tools such as
xemacs/emacs'
ediff [coloring] package are convenient
by providing context and fine-diff.
-
Do you have samples of testing patterns?
Do you use unit testing?
-
Yes.
com.lrdev.bn.BigRational.test()
implements
some 1114 (!)
testcases (for one
[fat!]
module!)
[only testing the public interfaces, leaving out function
aliases].
BigRational does not use the JUnit unit testing framework, for
code simplicity.
[
Java BigRational
]
-
What is recursion; recursive?
What is indirect recursion?
Recursion pattern?
-
A recursive function calls itself in the course of evaluating
its return value, typically with modified arguments [that are
usually getting 'smaller' in some kind, to let the recursion come to an
end].
A sample is a
recursive gcd implementation:
unsigned int gcd(unsigned int a, unsigned int b) {
return (b == 0 ? a : gcd(b, a % b));
}
.
The mathematical methods deduction and [transformed]
induction can be implemented by recursion.
True recursion involves more than one recursive call
[such as the two calls in
qsort],
otherwise it's easy to transform it into an iteration
[provided proper data structures
{e.g. a stack},
actually any recursion can be transformed into an iteration];
the gcd sample above shows an even simpler tail-end recursion.
Indirect recursion involves a flow-graph that eventually lets a
function be called again [e.g. a calls b calls c calls a again].
Deep recursion requires some
stack
space being available,
however truly recursive algorithms often work
with recursion depth log2(n)
[and hence small stack requirements].
[
typical recursion applications
/
recursive gcd implementation
/
bit count by recursive pairwise addition
/
Hilbert curve recursive [fractal] drawing pattern
/
recursion to iteration
/
recursive locking
/
stack overflow with recursion
]
-
How to avoid stack overflow with recursion?
-
(1)
Use
tail-end
recursion:
that allows you to work on linear structures
without additional
stack
requirements.
Doing explicit tail-end recursion can be seen as
converting the recursion into an iteration
[see
sample code].
(2)
On non-linear structures
[e.g.
balanced
trees]
take the shorter route first
[and the rest as tail-end recursion]:
that will cut the recursion depth
to less than log2 of the input size.
Such a small recursion depth will likely
not produce an overflow.
Sample:
reasonable
quick sort implementations [see the sketch after point (3) below].
It might not be so easy to recognize
the shortest path
[as it is with
quick sort;
consider the problem of finding
a way out of a labyrinth
solved by backtracking:
at each step
you don't know in advance
which direction will conceal the longest path].
(3)
If one can figure out a reasonable upper limit
for log2 of the problem size,
that can be used to limit the search depth:
if that limit is reached,
give up and try (all) other branches first;
performance will suffer
and maintaining an ordered list of branches-yet-to-take is required.
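A Java sketch of point (2) applied to quick sort:
recurse into the smaller partition only, and handle the larger one iteratively
[explicit tail-end recursion removal], which bounds the recursion depth by about log2(n)
[partitioning shown with a simple Lomuto scheme; hypothetical method names; pivot choice caveats as discussed in the quick sort question]:
static void quickSort(int[] a, int lo, int hi) { /* sorts a[lo..hi] */
  while (lo < hi) {
    int p = partition(a, lo, hi);
    if (p - lo < hi - p) {
      quickSort(a, lo, p - 1); /* smaller part: recurse */
      lo = p + 1; /* larger part: loop (removed tail call) */
    } else {
      quickSort(a, p + 1, hi);
      hi = p - 1;
    }
  }
}
static int partition(int[] a, int lo, int hi) { /* moves smaller elements left of the returned pivot index */
  int pivot = a[hi];
  int i = lo;
  for (int j = lo; j < hi; j++) {
    if (a[j] < pivot) {
      int t = a[i]; a[i] = a[j]; a[j] = t;
      i++;
    }
  }
  int t = a[i]; a[i] = a[hi]; a[hi] = t;
  return i;
}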
-
What are typical applications of recursion?
-
Of course as any recursion can be transformed into an iteration,
any iteration can be transformed into a recursion.
A typical case of such a [weak?, iterative?] recursion
is the Logo programming language's approach to use recursion
for list processing
[using its
empty?
predicate and butfirst
list
operator].
Stronger recursion is used
in divide-and-conquer algorithms
such as the
quick-sort
sorting algorithm,
in tree-traversing
such as in Unix' find
tool
[that scans hierarchical directory structures],
backtracking,
permutation enumeration
and many algorithms that work on graphs.
Geometrical applications include fractal curve drawing
and tiling.
[
about recursion
/
recursive gcd
/
Hilbert fractal curve
/
fractal tree
/
polyomino tiling
/
Logo Foundation
]
-
How to transform a recursion to an iteration?
-
The refactoring pattern of transforming recursive functions
to non-recursive, iteration-using functions typically introduces
state information, i.e. one or more additional variables,
updated in each iteration step [a previous recursion step].
The transformation of a simple, tail-end, recursion
such as the
gcd sample
only involves updating the function parameter variables
[no additional state;
the sample's additional variable c is only required as temporary holder
in a swap operation
and can be discarded at the bottom of the loop].
The more complex sample of e.g. integer power calculation:
int ipow(int i, unsigned int n) {
return
(n == 0 ? 1 :
(n % 2 == 0 ? ipow(i * i, n / 2) :
i * ipow(i, n - 1)));
}
that can be transformed to
int ipow(int i, unsigned int n) {
int j = 1;
while (n > 0) {
if (n % 2 == 0) {
i *= i;
n /= 2;
} else {
j *= i;
n--;
}
}
return j;
}
requires a new state j
because of the i*ipow() multiplication.
j*ipow(i,n) can be seen as an invariant [and 'result']
for all steps.
[Note that the sample codes do not assert !(n==0&&i==0).]
[
recursion
/
recursion applications
]
-
What is quicksort?
What is quicksort's performance?
-
Quicksort is an efficient sorting algorithm
that in the average case runs with O(n*log2(n)).
Quicksort chooses a pivot element
[often pseudo-randomly to avoid worst cases
{O(n^2)};
the median would be a good choice
{is however hard to find}],
which determines how to partition the input array;
larger elements are moved on one side,
smaller {and equal} ones on the other side
[in-place; no additional sort-space is needed];
the sub-arrays then are
recursively
sorted,
eventually up to pairs of elements,
which establishes the final sort step.
When recursively descending,
the smaller partition is chosen first
and the rest is done by
optimized tail recursion
to limit
stack
size use drastically
[recursion depth is maximally the binary logarithm of
the number of elements in the input array].
Quick-sort implements the divide-and-conquer paradigm.
[
C programming language's qsort
/
recursion
/
recursion applications
/
recursion to iteration
]
-
What are the typical questions on sorting algorithms?
-
The typical questions address
runtime efficiency
(as a function of input size, pre-ordering, key distribution, etc),
worst-case runtime behavior,
likelihood of that worst-case(s),
average case,
in-place data moves
(as opposed to scratch-space requirements),
stability
(keeping pre-ordering on otherwise equal elements),
overhead
(computational effort per iteration),
data access patterns
(e.g. important when sorting data on a single tape).
Samples:
quick sort
is efficient,
has a bad worst-case behavior,
which can however be made very unlikely,
moves data in-place and
is typically unstable;
insertion sort
is inefficient,
has however a low overhead,
moves data in-place and
is stable.
[
quick sort algorithm
/
comments on C's qsort
/
sort-up-to-a-unique-key pattern
]
-
What is lean software?
What are lean interfaces?
Lean software development?
-
Lean software design
may lead to less complex design
and to implementations that are more robust
[and show less security-flaws].
Lean software often has
a smaller footprint
and runs on smaller devices too
[and might run faster].
Lean interfaces are often
easier to use
and their use may be more robust;
lean APIs
and lean applications
may have a steeper learning curve
and therefore
[depending on the usage pattern,
e.g. frequency of use]
lead to better overall productivity.
Layering
[and facades and bridges]
may be an approach
to lean software design
[e.g.
Turtle
code
layered above
TurtleCore
code;
another sample are
multi-level layered
implementation hierarchies
and filter classes,
e.g. Java's input/output streams].
C
is a quite lean language
that can be used to produce
extremely lean executables
[therefore it does not even include
run-time size information or run-time type information
with variables, arrays, etc
{applications have to take care of this information explicitly;
C's
buffer overflow
problem
partly arises from this fact}].
C++
is a fat language
that can however be used to produce
extremely lean executables
[just like
C,
but requires thorough language insights!].
[Additional terms:
lean methods,
lean classes,
compact software,
fat interfaces,
monolith vs. framework,
compact command set vs. compact working set.]
Lean software development
means either
development of lean software
(discussed above)
or lean development of
[normal]
software,
which is rather about [development] processes
than about design.
Interesting approaches of such a lean development are:
find defects early
[late finding is expensive;
might lead to:
constant redesign / constant refactoring],
'delayed commitment'
[early developing in the wrong direction is expensive],
iterative development
[small steps;
proofs of concepts, proofs of functionalities, proofs of needs].
[
C
/
C overview
/
orthogonal design
/
robust code
]
-
A situation in which the add operator would not be associative?
-
Add operations on numbers can overflow,
so the expressions
((a + b) + c)
and (a + (b + c))
yield different results
with a and b [in a 32bit model] being int
s (32bit)
and c being a long long
(64bit)
[C
sample, -std=c99],
all three initialized to [e.g.] INT_MAX
:
the difference is whether only one or both additions
are done as long long (64bit) additions,
the first case leading to a [likely unintended]
intermediate integer overflow
(i.e. one bit is lost; later sign-extension issues arise).
[Note that an equivalent sample with short
s (16bit)
and an int
(32bit) won't overflow, since (in C)
[also short] integer calculations are done as int
s.]
Note that with integers, without widening,
(i.e. all three operands being e.g. ints),
overflow in [the usual] two's complement integer space
doesn't affect associativity:
e.g. (((INT_MAX-1)+(2))+(-1)) == ((INT_MAX-1)+((2)+(-1))) == INT_MAX.
Floating point number addition can overflow [too],
e.g. [on i686] with long double
s:
(LDBL_MAX*0.75) + (LDBL_MAX*0.5) + (LDBL_MAX*-0.375);
additionally rounding issues [due to limited precision]
can pop up, e.g. with 1.0 + (LDBL_EPSILON*0.5) + (LDBL_EPSILON*0.5).
[Note that on i686 double
s with equivalent expressions
based on DBL_MAX and DBL_EPSILON won't overflow nor differ due to
lost precision, since [in C] the whole statement (expression)
is calculated in the higher [CPU native] precision (long double).]
C++
allows operator overloading
and can't enforce [user provided] implementations
to be associative.
Java
supports the use of the add operator with Strings,
denoting string concatenation:
(("" + 2) + 3)
signifies "23",
("" + (2 + 3))
signifies "5"
(two type coercions from int to String vs. one).
Unix operating system
-
What is the Unix Man?
What is man man?
-
Unix
man
is an abbreviation for
the Unix [operating system] [online] manual [pages] [help] system.
Online means that the man
[shell] command
and the 'man pages' were/are found (if installed)
on UNIX (including Linux, Solaris, etc.) systems
in electronic form
[supplied in addition to a printed version]
[installation sometimes was omitted to reduce disk space requirements].
In a pre-HTML/HTTP era the pages were served locally,
suited for text terminals
(i.e. with minimal layout;
based on the nroff
package
allowing to format for terminals of different capabilities
{e.g. of displaying bold/inverted/etc.}
and different printers
{on some making words bolder by printing
e.g. U\bUN\bNI\bIX\bX
})
and hyperlinks were preceded by
a 'see also' section
(towards the end of a man page).
Just type
'man man
',
or 'man man | more
'
to start.
Type
'apropos -?
'
to search in the index of installed man pages.
Several web servers offer gateways to man pages;
scope, options {and details} of commands, system calls and library
functions vary from Unix to Unix;
be careful to recognize proprietary stuff,
to stay
portable.
Unix Man
is also known as a well known song seen from the Unix-point-of-view.
-
Would you implement a method man?
-
Documentation, the category to which manual text belongs,
is considered different from implementation code,
so I wouldn't usually implement man methods.
Having manual text in code
[not in code-comments or javadoc {and similar}]
bloats up library size and is usually not needed during run-time.
Lisp and Mathematica usually include short manual texts in the code
[as first string in a defun-body or ::usage string].
Java
allows [now] for extended meta data
with language constructs (classes, methods, fields, constants, etc.)
in the form of
Annotation
s,
and to define a retention policy for these to keep
meta information beyond the compile process,
facilitating the inclusion of documentation/manuals
into the runtime system.
[
unix man (unix manual)
/
man man
]
-
Sparse files: what are they good for?
How to identify sparse files?
-
Sparse file
characteristics are just a [file system] abstraction.
Disk blocks are only written (allocated/occupied) if not zero
(empty);
Unix' lseek system call allows skipping bytes (that are
hence not allocated).
Find free sparse conversion code
there
too.
[GNU tools tend to handle sparse file characteristics as well.]
Sparse file identification is done by
ls -l
(real size [in bytes])
vs. ls -s
(allocated size [in blocks])
on the command layer
or by comparing stat
system call's
size information
to the number of blocks,
on the system call (C language) layer.
-
How to create a sparse file?
-
[Low level [C]]
lseek far enough between writes
(so that at least one disk block is entirely skipped
(mind block boundaries)).
-
How to convert to a sparse file?
How to copy a sparse file?
-
Use
this
tool,
or use GNU cp's
--sparse=always
option.
-
How get sparse file used size?
-
Use
ls -s
to get the allocated size in disk blocks
(vs. ls -l
, the real size in bytes).
-
What about the error: lseek: value too large for defined data type?
-
As lseek returns 'the resulting offset location as measured in bytes from the beginning of the file'
we won't be able to lseek beyond 2 GiB on a 32 bit OS (even with much smaller seeks than 2 GiB)
unless lseek64 is used instead of lseek[32].
sparsefile.c documentation
recommends
-DSPARSEFILE_LSEEK=lseek64 -D_LARGEFILE64_SOURCE, or -D_FILE_OFFSET_BITS=64
to avoid EOVERFLOW (75, Value too large for defined data type) in this case.
-
What is automount?
-
Automounting is a convenient way to
automatically mount
networked file systems
[triggered by processes accessing
directories, subdirectories or files
residing on such file systems].
As a sample,
the 'physical' file system
/export/home/x/
gets automounted to /home/x/
[/home/ designated to trigger automounts,
as e.g. specified in the according NIS+ tables],
thus allowing users to log on any system
in a networked environment
and finding their expected home directory.
If not used, automounted systems are eventually unmounted,
allowing least invasive maintenance on the NFS servers.
-
How do I interpret Unix process trace information?
How to view truss output?
truss patterns?
truss usage?
-
You might have a look at these
Unix process tracing patterns.
Some
truss
usage is given too, along with [small]
output samples
and truss output viewing strategies.
-
What is the purpose of tracing?
Why tracing?
Why are process tracing strategies important?
-
Tracing in the sense of
truss
allows
fast and easy
trace-data acquisition
with a quite deep level of detail.
One gains a quick overview
of what a process/program is doing,
where it's stuck,
what files it tries to find and/or open,
etc., etc.
-
How to understand and analyze truss output?
-
Gain a deep insight into
the semantics of the UNIX system calls,
e.g. via
[Advanced UNIX® Programming,
Marc J. Rochkind,
Pearson Education Inc. / Addison-Wesley,
2nd Ed. 2004,
ISBN 013-141154-3].
Such literature (among other things) explains
why certain operations/tests must be atomic
(open(O_CREAT|O_EXCL)
[e.g. to provide a means for (filesystem backed)
critical section handling
{allows also to extend the scope of a lock
via [appropriate] network filesystems}]),
and certain must not be atomic
(fork()/exec()
[to be able to close/dup filedescriptors
and adjust other process properties after fork and before exec
{makes the [create-process system call] interfaces simpler
and more orthogonal,
and decreases dependencies of client executables}]).
The advantages of certain provided semantics
(unlink(argv[0])
{uninstaller that is able to remove itself},
and unlink(open())
{temporary scratch file that is cleaned up after program crash})
are illuminated too.
The philosophy of a reasonable operating system
might become more lucid via studying its system calls
(or as Rochkind put it into the preface:
"(...) System calls define what UNIX is.
Everything else (...) is built on this foundation.
(...) When one describes UNIX as
elegant, simple, efficient, reliable, and portable,
one is referring not to the commands (programs, applications)
(...) but to the kernel.
(...)").
[
how to trace a process
]
-
Is Err#25 ENOTTY a severe Unix system call return value?
-
No.
Some system call errors don't arise from unexpected behavior,
but from a [simple] test,
e.g.
isatty
[check input/output for interactivity,
e.g. to check whether output buffering is appropriate
(or rather autoflush is appropriate):
it isn't in the interactive (tty) case,
involving some performance penalty]
eventually issuing a ioctl(...,TCGETA)
call,
returning with an errno
set to ENOTTY
.
Judging the severities of system call errors and
exceptions is often not an easy thing to do
in intermediate software layers such as middleware or
library code
[apparent when trying to figure out on what log level
such a condition should be traced].
[
how to trace a process
/
Unix process tracing patterns on system call errors
]
-
Can the stat system call be sleeping?
-
Yes,
consider accessing a file on an
auto-mounted
nfs file system
and the nfs server doesn't respond or doesn't respond fast enough.
The same may apply for a slow file-system carrying device
such as a slow usb stick,
or a device that requires the user to key-in a pin
before information can be accessed.
[
how to trace a process
/
sleeping process status
]
-
How can I tell which Unix process is writing to a file?
How can I tell which files get opened by a process?
How can I tell what is written to a file?
-
The
lsof
tool
will tell you
which Unix process is writing to a file;
the
truss
tool will tell you
which files get opened by a process
and
what is written to a file.
-
How can I tell which process is listening to a TCP port?
How can I trace TCP communications?
-
Listened-to TCP ports
are visible in the
lsof
output, too,
and in the netstat -tunaWp
output.
The
truss
tool lets you trace TCP communications
on the user level;
network capturing tools
(either on the network interface [Unix device]
or on inter-connecting hardware)
show you what's happening
on the TCP protocol layer
(including retransmits, timing, etc).
Tracing UDP is very similar.
-
truss permission denied?
-
(1)
You need credentials
to trace a process
(relevant when trying to attach to a process);
usually you need to have the same user id;
root can of course truss all
processes
[reasons being:
you'll see part of the data flow,
that may otherwise be protected by file/device access permissions, etc].
(2)
On restricted systems
(including 'hardened' systems),
access to debug/analyzing tools
may be restricted,
since the tools may allow
possibly unwanted insights.
[Makes middleware debugging uglier on such systems.]
Other possibly restricted tools are:
netstat, ifconfig, ps.
(3)
non-root run truss
is not allowed to trace execs of executables with the set-id bit set
(truss: cannot trace set-id).
-
truss: cannot trace set-id?
-
A tracing process needs to run as root to trace execs of set-id executables,
in order to maintain the security model
((in this case: the parent) process of a different user id
may not inspect the process' memory);
see e.g.
[MR,2004].
[
truss permission denied
]
-
What is the meaning of: process is traced?
-
Usual modes of attaching a process
(in order to trace, debug, or control otherwise)
allow only one party to attach to a process;
attempts to also trace/debug an already attached process
will fail with mentioned message or similar
(e.g.: 'attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted').
-
How to show running processes in Unix?
Unix system load level?
-
A Unix system's process status
(including its 'state' in the narrower sense,
see below)
is displayed by the tool of the same name,
ps
,
using appropriate options
[e.g. ps -elf
or s in e.g.
ps -e -o pid,ppid,user,s,psr,nlwp,stime,time,args
{Solaris;
see ps
man page}].
Running state
(including online state
[a running process actually assigned to a processor])
is an 'active' process state,
in contrast to e.g.
the sleeping state
[e.g. waiting on user input, a timer, child process termination, etc.]
or the stopped state
(also known as traced state)
[suspended by {a shell's} job control or a debugger, etc.].
The top
tool allows
to filter process information by [running] state
[use the i
(idle) option;
for older versions
see this
stopped-processes-considered-idle patch
to filter stopped processes too;
however,
'stopped' is somewhat orthogonal to 'sleeping',
Solaris' pflags
tool
e.g. displays according processes as both
{PR_STOPPED|PR_ASLEEP}].
The uptime
command
will show you the load level
[how many processes / threads are 'runnable']
over 1, 5, and 15 minutes
[so does the top
tool too].
[
how to trace a Unix process
]
-
Why is a Unix process 'sleeping'?
Why is read sleeping?
-
The [Unix] operating system knows a number of
process states;
some of them
runnable
(ready to continue with the execution of user code),
some idle/blocked/sleeping
(waiting to be notified of some [specific] event
(e.g. network input available)).
The sleeping state merely means 'waiting for something';
with the
sleep
function it specifically means 'wait for a specified time'.
[
how to display Unix processes' state / how to show running processes
/
stat sleeping
]
math
-
What rounding modes are known?
What is ceil, floor?
-
Rounding [non-integer] numbers can be done
towards or away from zero [down/up],
towards or away from negative infinity [floor/ceiling],
or towards or away from its even neighbor [even/odd];
all either in any case
or only in the case of equidistant integer neighbors
[where the fractional part is .5],
giving a total of 12 rounding modes.
The default rounding mode
is usually considered to be half-up,
i.e. rounding to the nearest neighbor,
with .5 rounded away from zero.
Java's
BigDecimal
[whose methods divide and setScale require
a rounding mode parameter]
does/did implement
ROUND_UP,
ROUND_DOWN,
ROUND_CEILING,
ROUND_FLOOR,
ROUND_HALF_UP,
ROUND_HALF_DOWN,
ROUND_HALF_EVEN
['this is the rounding mode that minimizes cumulative error
when applied repeatedly over a sequence of calculations'],
and additionally ROUND_UNNECESSARY
which is used to only check whether rounding is necessary
[or rather
assert
that rounding is not needed].
BigRational.java
[doc]
additionally implements
ROUND_HALF_CEILING,
ROUND_HALF_FLOOR,
and ROUND_HALF_ODD.
[Missing are
ROUND_EVEN
and ROUND_ODD,
which would not commonly be used.]
Ceiling and floor are operations
naming the rounding of a number
towards 'ceiling' or 'floor',
i.e. positive infinity or negative infinity.
[
floor(){return round(ROUND_FLOOR);}
,
ceiling(){return round(ROUND_CEILING);}
and
truncate(){return round(ROUND_DOWN);}
.]
For a sample of an implementation of rounding,
see
Java BigRational
too.
-
What is the difference between remainder and modulus?
-
Remainder and modulus differ in the cases in which
the numerator's and denominator's signs differ.
The remainder is based on
rounding
down (towards zero; truncate):
[trunc(5,3)==1, rem(5,3)==2]
5==1*3+2,
-5==-1*3+-2,
5==-1*-3+2,
-5==1*-3+-2;
the remainder gets the numerator's sign.
Modulus is based on
rounding
floor (towards negative):
5==1*3+2,
-5==-2*3+1,
5==-2*-3+-1,
-5==1*-3+-2;
the remainder [modulus] gets the denominator's [divisor's] sign.
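In Java terms
[the % operator yields the [truncation based] remainder;
Math.floorMod, available since Java 8, yields the [floor based] modulus]:
(-5) % 3 /* == -2: remainder, numerator's sign */
Math.floorMod(-5, 3) /* == 1: modulus, denominator's sign */
5 % -3 /* == 2 */
Math.floorMod(5, -3) /* == -1 */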
-
What is gcd?
Where is it used?
-
GCD is an abbreviation for greatest common divisor
[number theory, mathematics].
A non-recursive gcd
C
implementation would be
unsigned int gcd(unsigned int a, unsigned int b) {
while (b != 0) {
unsigned int c = a % b;
a = b;
b = c;
}
return a;
}
,
a
[tail-end]
recursive gcd C implementation would be
unsigned int gcd(unsigned int a, unsigned int b) {
return (b == 0 ? a : gcd(b, a % b));
}
.
Another
[[non-tail-end (due to the <<1 terms)]
recursive]
implementation,
that doesn't use any
[expensive]
division/remainder operation, is
unsigned int gcd(unsigned int a, unsigned int b) {
return
(b > a ? gcd(b, a) :
(b == 0 ? a :
((a & 1) == 0 && (b & 1) == 0 ? gcd(a >> 1, b >> 1) << 1 :
((a & 1) == 0 ? gcd(a >> 1, b) :
((b & 1) == 0 ? gcd(a, b >> 1) :
gcd(a - b, b))))));
}
(See
[The Art of Computer Programming,
Volume 2: Seminumerical Algorithms,
Donald E. Knuth,
Addison-Wesley,
3rd Ed. 1997,
ISBN 0-201-89684-2],
p. 338).
[The
iterative
analogon would roughly be
unsigned int gcd(unsigned int a, unsigned int b) {
int i = 0;
if (b == 0) {
return a;
} else if (a == 0) {
return b;
}
while ((a & 1) == 0 && (b & 1) == 0) {
a >>= 1;
b >>= 1;
i++;
}
if ((a & 1) == 0) {
aeven:
do {
a >>= 1;
} while ((a & 1) == 0);
} else if ((b & 1) == 0) {
beven:
do {
b >>= 1;
} while ((b & 1) == 0);
}
if (b == a) {
return a << i;
} else if (b > a) {
b -= a;
goto beven;
} else {
a -= b;
goto aeven;
}
}
.]
A gcd is e.g. used in
memrot.c
[memory rotation, analogous to
C's
memmove
,
doing without scratch space]
to check a data buffer's symmetry properties
[memrot
itself is used in this
simple Ziv-Lempel (LZ77) encoding implementation
to allow multipart overlapping memory copying].
Another sample is
BigRational.java,
that uses
a gcd
for normalizing its rational number representation.
[
recursion
/
recursion to iteration
/
hp RPN gcd implementation
]
-
Gcd of negative numbers?
-
Gcd
does not make optimal sense with operands of different signs:
gcd(6,4)==2
[6==3*2, 4=2*2];
gcd(-6,-4)==-2
[-6==3*-2, -4=2*-2];
now
gcd(-6,4)==gcd(6,-4)== 2 or -2?
[-6==-3*2, 4=2*2 or -6==3*-2, 4=-2*-2?].
-
Non-recursive gcd assembly implementation?
-
33 D2 F7 F1 8B C1 85 D2 8B CA 75 F4
[x86,
32bit,
meaning
xor edx,edx;
div eax,ecx;
mov eax,ecx;
test edx,edx;
mov ecx,edx;
jnz -12
].
[
gcd
/
gcd C implementation {used for this sample}
/
on x86 assembly samples
]
-
How do I calculate the golden ratio?
Where is the golden ratio used?
-
In
dc
[Unix'
desk calculator program]
e.g. by:
8k 5v1+2/ p q
(half of one plus root of five).
The solution comes from solving
the
quadratic equation
r:1=(r+1):r.
The golden ratio is e.g. used in visualizations, e.g.
here.
-
Quadratic equation?
-
Solutions to quadratic equations
[by use of a hp(42s)]
can be found
here
and
here.
The
golden ratio
arises from such a solution.
wine
-
Is wine something special?
-
No,
every domain has its idiosyncrasies.
-
Is wine healthy?
-
No,
MDs say:
wine is not healthy.
However,
there's the
French
paradox;
which however may be based on a pseudo-correlation.
-
What means Volnay?
Where is Volnay?
-
Volnay is a pleasant place in
the Côte de Beaune, Côte d'Or, Burgundy, France, Europe;
south of Pommard and Beaune.
Volnay is a very fine and pleasant Burgundy red wine,
e.g. from Remoissenet or d'Angerville.
This
is what Volnay looks like from
an oenological point of view.
-
Montrachet?
-
Montrachet is a wine domain
located south of Meursault and
Volnay,
split among the villages Puligny and Chassagne;
yields the best white wines
(subject to
personal taste).
Ernst Meier
writes [about]:
maybe the greatest and most complex Chardonnay wines,
against which the most ambitious wine makers [in the whole world]
are measured;
ideal terroir and most experienced wine makers.
-
Is there a new Bordeaux wine classification?
A new Bordeaux classification?
-
Yes,
in his
editions of Bordeaux Total
(e.g. the year 2000 edition
[Bordeaux Total,
René Gabriel,
Hallwag,
2000,
ISBN 3-7742-5279-3]),
René Gabriel
published
a new Bordeaux wine classification
that, unlike the old classification (of 1855), includes the St. Emilion and Pomerol
regions.
Some more information can be found
here.
Among the
Médocs
he favored
[over the years]:
Cos d'Estournel,
Ducru-Beaucaillou,
Gruaud-Larose,
Lafite-Rothschild,
Latour,
Léoville-Barton,
Léoville-Las-Cases,
Margaux,
Montrose,
Mouton-Rothschild,
and as second growths
Grand-Puy-Lacoste,
Léoville-Poyferré,
Lynch-Bages,
Palmer,
Pichon-Longueville Baron,
Pichon-Longueville-Comtesse-de-Lalande,
Rauzan-Ségla.
-
Other Médoc selections?
-
Hubrecht Duijker
favors about the same
17
(here 16;
excluding Lacoste and Pichon-Baron,
including Malescot-Saint-Exupéry).
My
closer Médoc selection is:
Cos d'Estournel,
Ducru-Beaucaillou,
Latour,
Léoville-Las-Cases,
Palmer.
My broader Médoc selection additionally includes
at least:
Carruades,
Clos du Marquis (e.g. 02),
Forts de Latour,
Glana (e.g. 98),
Gloria (e.g. 05),
Lagune (e.g. 01),
Léoville-Poyferré,
Lynch-Bages,
Marquis de Terme,
Mouton,
Tour Haut-Caussan.
[
René Gabriel's selection
]
-
Which are the best Pichon-Longueville-Comtesse-de-Lalande vintages?
-
According to
René Gabriel
2003, 2002, 1994, and maybe 1982 and 1926.
-
What about the 1988 Bordeaux vintage?
-
The 1988 vintage is a very interesting vintage
in the Bordeaux region.
Lots of [fine] tannins.
Be careful with the sediments.
I loved the 1988 Cos d'Estournel for a long time
[and still do].
In France,
1988 is the first of what they call
the "vous-savez-c'est-la-trilogie" vintages.
[1988, like 1989 and 1990, is generally a good vintage
throughout central Europe.]
[
K&R2@1988
/
hp42s@1988
/
Mouline@1989
/
Krug@1989
]
-
Châteauneuf-du-Pape recommendations?
-
I recommend:
Beaucastel,
Clos des Papes,
La Gardine,
Mont Redon.
Note that most Châteauneuf-du-Pape producers
produce both red and white wines
and additionally special cuvées.
-
What are the advantages of France?
-
The most classical and best-aging Cabernet Sauvignons
(e.g. from
Pauillac
and
St. Julien)
[though many alternatives world-wide],
the finest Pinot Noirs
[alternatives from Oregon/USA and New Zealand],
Syrahs
(e.g. Guigal's La Mouline 1989)
[alternatives from Australia],
the best fine Chardonnays
(e.g.
Montrachets)
[alternatives world-wide],
[dry] Rieslings
(e.g. Muré's St. Landelin),
the best
Champagnes
(e.g.
Krug
1989),
and some more:
all of these have defined and still define the standards.
France hosts a wealth of
exquisite producers
and has done so for a long time.
-
What questions arise with wine?
-
First there's the question about a wine's quality,
which often seems to be the predominant one
[for wine enthusiasts].
Guides like
[Ernst Meier, 2003]
almost exclusively focus on quality.
Extraordinary quality depends almost always
on a wine's vintage.
Then there's (personal) taste.
Sweetness, tannins, opulence, wood
are some of the key properties
with respect to one's taste
[of course, there are subtleties too,
e.g. whether one {generally} prefers Chablis or Meursault].
A further aspect is the price/value ratio.
Wine prices and price/value ratios
differ tremendously.
Some guides
and web sites
are dedicated to this question.
A similar question is on
where to find a specific wine/vintage.
Another question is whether one prefers
early drinkable wines or vintages
[in the
Bordeaux
e.g. 1992 and 1997 were early drinkable vintages,
1994 and 1998 were not].
Yet another question is
whether one likes to taste wines
that potentially need much aging
early
[in their fruit phase, often tremendously opulent].
Some like very mature wines
[old red Burgundy wines are e.g. quite different from young ones];
some tolerate or like oxidation notes
[of course Ports and Amarones show such notes from start on].
-
How to start with wine literature?
-
Maybe start with wine meta literature
that tries to analyze the wine literature market.
Such books try to discuss
'whom to trust?'
and might also provide insights into meta topics such as
why you don't find certain wines in certain parts of the world
or why lots of people overrate (or underrate) local wine.
A good
and probably timeless
introduction
to assessing and getting to know wines
is provided by
Michael Broadbent
with his books
Wine Tasting, Enjoying, Understanding
/
Wine Tasting: How to Approach and Appreciate Wine
[many editions with different publishers;
translated to at least nine languages,
e.g. to German:
Weine pruefen, kennen, geniessen];
it also includes technical tips
on how to organize tastings and cellars.
-
Has wine received more attention recently?
-
Yes,
seemingly.
-
What about cork vs. silicon?
-
Since acceptable cork
[good cork, well manufactured cork]
is expensive,
it's not usually used with cheap wines.
Cheap cork is quite frequently
badly manufactured or infected,
and therefore often responsible for faulty wines.
So it's wiser to use silicon instead of [cheap] cork
with cheap wines.
It is said that
Gaja forces his cork suppliers to choose good cork sources
by demanding especially long pieces of cork
[if you ever wondered about their size...].
-
Does silicon mean 'cheap wine'?
-
No.
Cheap wines should use silicon instead of cork
for reasons mentioned
here.
Expensive wines are free to use both cork or silicon.
However, it's not [yet] clear whether aging is problematic with silicon.
[
cork vs. silicon
]
-
Do you feature pictures / visualizations / descriptions of the ISO tasting glass?
-
Yes, information and virtual reality modeling language (VRML) visualization
of the ISO 3591:1977 tasting glass can be found
here.
For availability of ISO standards, see
question on ISO/IEC 9899:1999.
[
where to get the ISO glass {in Zürich}
]
-
Where can I buy wine in Zürich?
-
I've had very nice wines from
Baur-au-Lac,
Globus,
Mövenpick at
Jelmoli
and at
Enge,
Kummer.
Nearby at Zumikon there's
Cave BB.
-
Which Zürich restaurants do you recommend for an exquisite wine list?
-
I'd recommend
Caduff's Wine Loft,
Caveau Mövenpick,
Kunsthaus,
Lindenhofkeller,
Lipp.
[Michelin
(Suisse) 2014 [ISBN 978-2-06-718902-7]
recommends
(by the symbol 'A particularly interesting wine list')
Bü's,
Caduff's Wine Loft,
Florhof,
Lindenhofkeller,
Pavillon (at Baur au Lac),
Rigiblick - Spice,
Sonnenberg,
The Restaurant (at The Dolder Grand),
Uto Kulm,
Widder Restaurant.]
-
Where to buy wine glasses in Zürich?
-
Riedel, Spiegelau, Schott, etc:
e.g.
Globus (@Bellevue),
Jelmoli.
ISO:
e.g.
Sibler (@Münsterhof).
-
Where can I buy wine in Aarau?
-
Passion du vin,
Weinkellereien (@Rohrerstrasse).
-
Which Aarau restaurants do you recommend for a good wine list?
-
[Question under revision.]
-
Where to buy wine glasses in Aarau?
-
Difficult now;
see
wine glasses in Zürich.
-
What about Swiss wine?
-
As
Switzerland is entirely surrounded
by more-or-less famous wine-producing countries
(France,
Italy, Germany, Austria),
most regions in Switzerland [are able to and do] produce wine.
Some consider the Merlots from the Ticino the best Swiss [red] wines.
[Among some more: Riflessi d'Epoca, Sassi Grossi, Sinfonia
and Montagna Magica, Vinattieri].
Others love the Pinot Noirs from the 'Bündner Herrschaft'
(villages Fläsch, Malans, etc);
e.g.:
Gantenbein,
Hermann,
Pelizzatti,
Studach,
Tscharner.
Ernst Meier
[Der grosse Westentaschen Weinkenner 06/07,
Ernst Meier,
Barrique,
10th Ed. 2005,
ISBN 3-905126-09-5]
may serve as a guide for the different regions
[the book covers the whole world {of wine};
9th Ed. 2003
ISBN 3-905126-08-7;
8th Ed. 2001
ISBN 3-905126-07-9].
His top ten
(in 2005) were:
Castello Luigi,
Conte di Luna,
Klausener: Gran Reserva,
Montagna Magica,
Orizzonte,
Pio della Rocca,
Platinum,
Rompidée,
Ronco Balino,
Rosso dei Ronchi,
Trentasei,
Tenimento dell'Ör: Syrah,
Vinattieri,
for the [red] Ticino wines;
and
Fromm,
Gantenbein,
Grünefelder,
Marugg,
Mattmann,
Obrecht,
Pelizzatti,
Stäger,
von Tscharner,
Wegelin,
for the [red] Bündner Herrschaft wines.
-
Is there a German language version of this part of the f.a.q.?
-
Yes,
it can be found
here.
RPN
-
Where can I find information on a hp42s?
Where can I find pictures of a hp42s?
Where can I find information and pictures on other HPs?
-
The hp42s (HP-42S) RPN Scientific Calculator
[code name Davinci, family Pioneer]
was produced from 1988 to 1996.
Information [technical data] on it can be found at
finseth.com,
a picture of hp42s #3319S14906 can be found
here.
[
hpmuseum.org with excellent HP pictures
/
information on HPs
/
HP manuals offer @hpmuseum
/
how to program the hp42s and similar hp calculators
/
interesting sources for the hp42s
]
-
Where can I find a hp42s manual?
-
Manuals on the hp42s RPN Scientific Calculator
and other HP models are offered at
'HP manuals offer at HP museum'.
-
What are stack based programming languages?
Samples of stack based languages?
-
Stack based programming languages such as
PostScript,
Forth
and
the Java virtual machine (JVM)
implicitly use a stack
[to transport function arguments and carry temporary results].
They are often designed to allow fast interpretation of
language tokens
(which themselves come in a compact stream
over a network
{to printers (print jobs),
web browsers (applet code),
etc}
or from limited storage media {in embedded systems, etc}),
often on small devices.
Stack based languages use a postfix notation syntax.
[
PostScript code sample
/
why-Forth? article
]
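[To make the 'implicit stack / postfix notation' point concrete,
a toy evaluator in C {an illustration only, unrelated to the
linked PostScript sample}:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* numbers are pushed; operators pop two values and push the result */
int main(void) {
    const char *tokens[] = { "2", "3", "4", "*", "+" }; /* postfix for 2 + 3 * 4 */
    double stack[64];
    int sp = 0;
    size_t i;
    for (i = 0; i < sizeof tokens / sizeof tokens[0]; i++) {
        const char *t = tokens[i];
        if (strcmp(t, "+") == 0 || strcmp(t, "*") == 0) {
            double b = stack[--sp], a = stack[--sp];
            stack[sp++] = (*t == '+') ? a + b : a * b;
        } else {
            stack[sp++] = atof(t);
        }
    }
    printf("%g\n", stack[0]); /* prints 14 */
    return 0;
}
.]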
-
Does hp implement a stack programming language?
-
With the fixed-stack models
[RPN;
e.g.
hp42s]:
Not really.
The stack is mainly used for calculations;
it's too small on the [old] hps.
The program tokens
still have one [small] operand
(as in
RCL 7
;
typical
stack languages
usually have none).
The RPL models
[e.g.
hp28s,
hp50g]
do implement a stack language.
[
RPL stack functions
/
RPN stack functions
]
-
What are the RPL stack modifier functions?
-
(hp28s:)
CLEAR
[equivalent: DEPTH DROPN],
DEPTH,
DROP
[equivalent: 1 DROPN],
DROP2
[equivalent: 2 DROPN],
DROPN,
DUP
[equivalents: 1 DUPN, 1 PICK],
DUP2
[equivalent: 2 DUPN],
DUPN,
OVER
[equivalent: 2 PICK],
PICK,
ROLL,
ROLLD,
ROT
[equivalent: 3 ROLL],
SWAP
[equivalents: 2 ROLL, 2 ROLLD].
[
sample/simple big rational calculator
implementing part of the RPL stack operations
]
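[A rough C model of some of these words
{array-based, no under/overflow checks; meant to illustrate the
semantics only, not how the calculators implement them}:
#define MAXDEPTH 64
static double stk[MAXDEPTH]; /* stk[0] is the bottom */
static int sp = 0;           /* points just past the top */
static void push(double x) { stk[sp++] = x; }
static void drop(void) { sp--; }                          /* DROP */
static void dup1(void) { stk[sp] = stk[sp - 1]; sp++; }   /* DUP, i.e. 1 PICK */
static void over(void) { stk[sp] = stk[sp - 2]; sp++; }   /* OVER, i.e. 2 PICK */
static void pick(int n) { stk[sp] = stk[sp - n]; sp++; }  /* n PICK */
static void swap(void) {                                  /* SWAP, i.e. 2 ROLL */
    double t = stk[sp - 1];
    stk[sp - 1] = stk[sp - 2];
    stk[sp - 2] = t;
}
.]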
-
What are the RPN stack modifier functions?
-
(hp42s:)
RCL ST X
[RPL equivalent: DUP],
RCL ST Y
[RPL equivalent: OVER],
RCL ST Z,
RCL ST T
[equivalent: R^,
since while copying T {to X} the original T drops out],
Rv,
R^,
STO ST Y,
STO ST Z,
STO ST T,
X<> ST Y
[equivalent: X<>Y],
X<> ST Z,
X<> ST T,
X<>Y
[RPL equivalent: SWAP].
-
Why would one want to clear the RPL stack?
-
Being able to clear the stack
enables garbage collection
(freeing memory).
This is not so much required with
RPN
with its small fixed stack,
but is provided there all the same.
-
How to calculate mean values on a hp[42s]?
-
HP statistics data is kept in the sigma (Σ) registers;
those registers need to be cleared (zeroed)
before adding the individual values (which is done with Σ+; Σ- corrects mistyped values).
Sample: the mean of 2, 3, 5, 7 is 4.25:
(f) (CLEAR) CLΣ
2 Σ+ 3 Σ+ 5 Σ+ 7 Σ+
(f) (STAT) MEAN
-
How to enter negative numbers on an hp?
How do I enter -180 on hp42s?
-
(1) Enter the (positive) number first, then negate:
on hp42s i.e.: 1 8 0 +/-.
(2) On older HPs, the 'change sign' key is 'CHS':
i.e.: 1 8 0 CHS.
(3) A similar issue arises with the missing 'xth root of y' key/function:
invert x first,
i.e.: 2 ENTER 1 2 1/x (f) yx [the 12th root of 2].
-
How to input 'st y' on hp42s?
-
Stack arguments are introduced by '.':
i.e. RCL ST Y is entered as: RCL . ST Y.
[
hp42s stack modifier functions
]
-
How to adjust the LCD contrast on an hp42s?
-
Keep the EXIT key pressed while keying + or -.
miscellaneous
-
What is VRML?
Do you feature VRML converters?
Do you feature VRML samples?
VRML applications?
What is wrl?
-
The virtual reality modeling language (VRML)
defines
document formats used to
describe three-dimensional structures.
Among other things it supports
colors, textures, surface properties and illuminations.
A VRML viewer application is needed to render the models
and implement [interactive] user-operations
like walk-thru, rotate, etc.
VRML samples can be found here:
ISO glass model
[information],
puzzle pieces model
[information],
torus.
Two of my VRML applications are
the
glass constructor
and a
3-D puzzle solver.
A VRML converter is
m2wrl
[Mathematica graphics to VRML 1.0].
wrl is the usual file suffix used with VRML documents.
-
Where can I find a VRML circle?
-
VRML
is not a suitable format to represent a circle;
VRML is 3D, circles are 2D.
A
torus
(doughnut)
[in Mathematica notation:
ParametricPlot3D[{(2 + Sin[2 Pi u]) Cos[2 Pi t],
(2 + Sin[2 Pi u]) Sin[2 Pi t],
Cos[2 Pi u]},
{t, 0, 1},
{u, 0, 1}]
,
see e.g.
m2wrl]
can be considered circle-like
(only small extensions around a circle-line;
{in this sample, however, the 'small' dimension u is not
chosen all that small relative to the circle-line t}).
To calculate the points from the formula above [which form two
kinds of circles, see the black lines in
m2wrl-torus.gif],
just run both t and u from 0 to 1
(say, 0.0, 0.1, 0.2, ...).
['glass'
{another VRML generating application using circles}
uses symmetry considerations to calculate only a fraction of
the circle points.]
Circles are not suited as finite [surface] elements in VRML either;
usually triangles or quadrilaterals [often trapezoids,
as in the torus sample] are used.
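[For illustration, a minimal C sketch that samples points from the
formula above {it only prints raw coordinates; the quadrilateral/VRML
generation as in m2wrl is a separate step}:
#include <math.h>
#include <stdio.h>
int main(void) {
    const double pi = 3.14159265358979323846;
    const int n = 10; /* run t and u from 0 to 1 in steps of 0.1 */
    int i, j;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            double t = (double)i / n, u = (double)j / n;
            double x = (2.0 + sin(2 * pi * u)) * cos(2 * pi * t);
            double y = (2.0 + sin(2 * pi * u)) * sin(2 * pi * t);
            double z = cos(2 * pi * u);
            printf("%g %g %g\n", x, y, z);
        }
    }
    return 0;
}
.]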
-
What is the RGB value for the bordeaux color?
What is the bordeaux red color code?
-
Well, the bordeaux red color is unfortunately not in
/usr/X11/lib/X11/rgb.txt
...
However, from
various sources
I have seen
values such as
153 51 102 [#993366]
(which is a
'browser safe palette'
pattern),
92 1 32 [#5c0120]
and
even
171 78 82 [#ab4e52].
[
Bordeaux [red] wine
]
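[The hex notations above are just the RGB triples written in base 16,
e.g. {a minimal C sketch}:
#include <stdio.h>
int main(void) {
    printf("#%02x%02x%02x\n", 92, 1, 32); /* prints #5c0120 */
    return 0;
}
.]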
-
What are the two Zürich daily newspapers?
-
NZZ
(Neue Zürcher Zeitung, since 1780)
and
T.A.
(Tages-Anzeiger, since 1893).
-
Jelmoli?
-
You can find many things at
Jelmoli
['house of brands'],
e.g.
Newton Unfiltered Merlot,
Krug,
perfumes,
fashion (Strellson, Boss, ...),
etc,
etc.
-
Where can I find Zürich pictures?
-
lrdev features some Zürich picture collections:
some spots in Zürich
(mainly around Bahnhofstrasse),
Kunsthaus Zürich facade
(Pfauen [Platz];
should have grown into a VRML model
[but the material and model weren't {yet} good enough]),
Zürich public transportation / Adlisberg (Zürichberg).
-
Since when is this site up?
-
www.lrdev.com
has been up since April 2001;
older content had been available
on access.ch/lr
since March 1997.
-
Can we hire your expertise?
-
Yes, drop a mail to
Eric Laroche <laroche@lrdev.com>.
A short listing of
my expertise items
can be found on
my homepage.
-
Do you work for universities too?
-
Yes, likewise drop a mail to
Eric Laroche <laroche@lrdev.com>.
ask
Disclaimer: There is no warranty of fitness for a particular purpose for this data.
Trademarks: Products mentioned here may be trademarks of their respective companies or organizations.
Patents: Products described here may be protected by patents or pending applications.
Taste: Opinions formulated here may be based on personal taste.
Copyright (C) 2003-2014 Eric Laroche. All rights reserved.
Eric Laroche
/
Fri Dec 19 2014
/
questions?
/
feedback?
/
homepage