The most significant limitation of SwiftRay is its slowness. A few weeks ago, I sped it up by using more than one core: since my Mac has 4 cores, it ran about 3.5 times faster.
The first version of SwiftRay was based on a book, and I have since bought the second volume, Ray Tracing: The Next Week. It presents an algorithm, the Bounding Volume Hierarchy, which consists of sorting primitives (only spheres at the moment) into a binary tree. The principle is that instead of testing whether a ray hits any object in the scene, we first test whether it hits the volume of the whole hierarchy at all. If it does, we test the two halves, then their two halves, and so on, until we get to a leaf of the tree, that is, a primitive.
The speed-up comes from the facts that:
a lot of rays won’t hit any object at all
we can get quickly to the object that does receive the ray
A bvh_node inherits the hitable “protocol” (though there’s no such thing in C++). Its constructor takes a list of hitables, which are actually the primitives in the scene.
I will spare you the implementation of the constructor, which actually builds the binary tree. Once the tree has been built, each bvh_node has a left hitable and a right hitable as children, which either have two children of their own, or point to the same hitable.
The following method determines whether the volume (bounding box) of the BVH node is hit. If it is, it calls itself recursively on its left and right children to see which one is closer, or whether either is hit at all:
I had a problem: images still rendered, but computations were actually much slower, by a factor of 2 to 40 depending on the scene!
Since I’m only human, my first assumption was that my code had an error. And yes, it had: the code that determines whether a Bounding Box is hit was wrong, which made the process slower. But once that was fixed, it was still very slow.
My second thought was that maybe the tree construction was buggy. This would also slow the code down considerably, since it could not tell quickly whether a volume was hit. It was very hard to see in the debugger, so I eventually introduced Rectangle primitives so that I could draw the Bounding Boxes. And to my surprise, my code was correct!
That was when I thought that profiling my code could give me some insight, so I fired up Instruments > Time Profiler, only to find that the code spent a lot of time in
Searching the web, I learned that a “protocol witness” is an entry in a table, similar to a virtual method table, that Swift uses to resolve calls made through a protocol. That was odd to me. BoundingNode being a struct, I expected a direct call to the hit() method, with no indirection.
I watched a WWDC ’16 video which proved me wrong. Yes, Swift does use Dynamic Dispatch on structs when they are called through a protocol they conform to.
Once, I learned a little Haskell
A few months ago, I took some time off to learn the basics of Haskell. A lot of exercises in the books have to do with… Trees.
A binary tree in Haskell may be declared as:
In English: A Tree is either a Leaf (with a as data) or a Node with two child trees.
I have watched a presentation which was boring for the most part, but had an interesting point: Haskell’s Algebraic Data types can be implemented as Enums in Swift:
The indirect keyword was added in Swift 2. It tells the compiler that the recursive definition of the enum is intentional, so that the nested values are stored behind a level of indirection.
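As an illustration of the kind of enum discussed here, this is a minimal sketch of a binary tree in Swift. The names Tree, leaf, node and leafCount are mine, chosen for the example.

```swift
// A generic binary tree: either a leaf holding a value, or a node with two subtrees.
indirect enum Tree<T> {
    case leaf(T)
    case node(Tree<T>, Tree<T>)
}

// Counts the leaves by pattern matching on the two cases.
func leafCount<T>(_ tree: Tree<T>) -> Int {
    switch tree {
    case .leaf:
        return 1
    case let .node(left, right):
        return leafCount(left) + leafCount(right)
    }
}

let tree: Tree<Int> = .node(.leaf(1), .node(.leaf(2), .leaf(3)))
print(leafCount(tree))
```

The switch is resolved on the enum case itself, so no protocol is involved in the recursion.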
In case you have not yet understood: I am going to replace the struct calling its own method with an enum, to get rid of Dynamic Dispatch.
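To make the idea concrete, here is a minimal sketch of what a hierarchy expressed as an enum could look like. It is not SwiftRay’s actual code: the types Vec3, Ray, Box, Sphere and BVH, and the hand-built two-sphere scene, are assumptions made for the example. The bounding box test is the classic “slab” test.

```swift
struct Vec3 { var x, y, z: Double }

func - (a: Vec3, b: Vec3) -> Vec3 { Vec3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z) }
func dot(_ a: Vec3, _ b: Vec3) -> Double { a.x * b.x + a.y * b.y + a.z * b.z }

struct Ray { var origin: Vec3; var direction: Vec3 }

struct Box {
    var lower: Vec3
    var upper: Vec3

    // Slab test: the ray hits the box if its parameter intervals overlap on all three axes.
    func isHit(by ray: Ray, tMin: Double, tMax: Double) -> Bool {
        var lo = tMin, hi = tMax
        let axes = [
            (ray.origin.x, ray.direction.x, lower.x, upper.x),
            (ray.origin.y, ray.direction.y, lower.y, upper.y),
            (ray.origin.z, ray.direction.z, lower.z, upper.z),
        ]
        for (origin, direction, low, high) in axes {
            let inverse = 1.0 / direction
            var t0 = (low - origin) * inverse
            var t1 = (high - origin) * inverse
            if inverse < 0 { swap(&t0, &t1) }
            lo = max(lo, t0)
            hi = min(hi, t1)
            if hi <= lo { return false }
        }
        return true
    }

    // Smallest box enclosing both boxes.
    func union(_ other: Box) -> Box {
        Box(lower: Vec3(x: min(lower.x, other.lower.x), y: min(lower.y, other.lower.y), z: min(lower.z, other.lower.z)),
            upper: Vec3(x: max(upper.x, other.upper.x), y: max(upper.y, other.upper.y), z: max(upper.z, other.upper.z)))
    }
}

struct Sphere {
    var center: Vec3
    var radius: Double

    var box: Box {
        Box(lower: Vec3(x: center.x - radius, y: center.y - radius, z: center.z - radius),
            upper: Vec3(x: center.x + radius, y: center.y + radius, z: center.z + radius))
    }

    // Returns the nearest ray parameter t in (tMin, tMax) where the ray hits the sphere, or nil.
    func hit(_ ray: Ray, tMin: Double, tMax: Double) -> Double? {
        let oc = ray.origin - center
        let a = dot(ray.direction, ray.direction)
        let b = dot(oc, ray.direction)
        let c = dot(oc, oc) - radius * radius
        let discriminant = b * b - a * c
        guard discriminant > 0 else { return nil }
        let root = discriminant.squareRoot()
        for t in [(-b - root) / a, (-b + root) / a] where t > tMin && t < tMax {
            return t
        }
        return nil
    }
}

// The hierarchy is an enum: a leaf holds a primitive, a node holds a box and two subtrees.
indirect enum BVH {
    case leaf(Sphere)
    case node(Box, BVH, BVH)

    func hit(_ ray: Ray, tMin: Double, tMax: Double) -> Double? {
        switch self {
        case let .leaf(sphere):
            return sphere.hit(ray, tMin: tMin, tMax: tMax)
        case let .node(box, left, right):
            guard box.isHit(by: ray, tMin: tMin, tMax: tMax) else { return nil }
            let leftHit = left.hit(ray, tMin: tMin, tMax: tMax)
            // Only accept a hit on the right side if it is closer than the left one.
            let rightHit = right.hit(ray, tMin: tMin, tMax: leftHit ?? tMax)
            return rightHit ?? leftHit
        }
    }
}

let near = Sphere(center: Vec3(x: 0, y: 0, z: -2), radius: 0.5)
let far = Sphere(center: Vec3(x: 0, y: 0, z: -5), radius: 0.5)
let scene = BVH.node(near.box.union(far.box), .leaf(near), .leaf(far))
let forward = Ray(origin: Vec3(x: 0, y: 0, z: 0), direction: Vec3(x: 0, y: 0, z: -1))
let upward = Ray(origin: Vec3(x: 0, y: 0, z: 0), direction: Vec3(x: 0, y: 1, z: 0))
let closest = scene.hit(forward, tMin: 0.001, tMax: .infinity)
let missed = scene.hit(upward, tMin: 0.001, tMax: .infinity)
```

Here the recursion goes through a switch on the enum’s cases rather than through a protocol method, which is the point of the change.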
For the last two and a half years, I have been working for Meteo Consult on an internal Mac application that creates 3D meteorological maps, which are broadcast on the TV channel La Chaîne Météo.
A detailed texture is needed
One major problem we have had since the beginning is how to cover the Earth with a texture, since the texture has to be huge to be detailed enough. Currently we are stuck with a smaller texture, which has two drawbacks:
The Earth is hardly detailed enough, so the minimum altitude of the camera is limited. For example, La Martinique or La Guadeloupe are only small blurry dots. We currently rely on 2D maps instead.
Even with its low resolution, the texture takes a lot of time to load on the GPU; about 3 s on my MacBook Pro 2013.
Fortunately, the application runs on a Mac Pro, which has a lot of GPU RAM; but even if we could load a big texture, GPUs generally don’t handle textures larger than 16384 pixels on a side, so we would be stuck anyway.
The solution is probably obvious to you: use tiles, boy! Of course, we had thought of that from the beginning of the project, and I even tried to make something work, but to no avail. The major difficulty was determining which tiles were visible. It’s rather easy on a flat 2D surface, but I could not find a reliable solution for the round 3D surface of the Earth.
Megatexture, also known as “Sparse Virtual Texture”, is a technique for composing a big virtual texture out of tiles. The term was coined by John Carmack, who came up with the technique. I’ll stick with “Megatexture” since it sounds much cooler than “Virtual Texture”.
Determining visible tiles
The great insight is how visible tiles are determined: the scene is rendered to an offscreen buffer with a special fragment shader. In my case, the Megatexture is at most 256 by 256 tiles and has a maximum of 8 mipmap levels, so each value fits in one byte: the shader stores the tile’s x in the red component, the tile’s y in the green component, the mipmap level in the blue component, and the Texture ID in the alpha component. The scene is rendered to an RGBA8 offscreen buffer.
There may be several megatextures in the same scene. The Texture ID makes it possible to tell them apart in the Cache and, later, in the Indirection Table. Since objects which are not megatextured won’t be processed by the shader, you need to reserve a special Texture ID to mean “No texture”. It must correspond to the buffer’s clear color, so I advise you to choose the value 0x00, which corresponds to a transparent color (since the Texture ID is saved to the alpha channel).
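The encoding described above can be sketched on the CPU side as follows. This is only an illustration: the TileId struct, its field names and the noTexture constant are assumptions of mine, not the actual code.

```swift
// One tile of one megatexture, as encoded in a single RGBA8 pixel.
struct TileId: Hashable {
    var textureId: UInt8   // alpha channel; 0 is reserved for "no texture"
    var x: UInt8           // red channel
    var y: UInt8           // green channel
    var mipLevel: UInt8    // blue channel

    static let noTexture: UInt8 = 0

    // The four bytes as they would be written to the offscreen buffer.
    var rgba: [UInt8] { [x, y, mipLevel, textureId] }
}

extension TileId {
    // Decodes a pixel; returns nil for the clear color or for malformed input.
    init?(rgba: [UInt8]) {
        guard rgba.count == 4, rgba[3] != TileId.noTexture else { return nil }
        self.init(textureId: rgba[3], x: rgba[0], y: rgba[1], mipLevel: rgba[2])
    }
}

let sample = TileId(textureId: 1, x: 12, y: 200, mipLevel: 3)
let decoded = TileId(rgba: sample.rgba)
let cleared = TileId(rgba: [0, 0, 0, 0])
```

Because the clear color has an alpha of 0, any pixel that the shader never touched decodes to nil.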
Tiles determination shader
I’m sorry but I can’t provide my own code, so I’ll give you Sean Barrett’s instead, who was a pioneer in the technique, and made his code public:
// analytically calculates the mipmap level similar to what OpenGL does
This first part determines the mipmap level. The formula is copied straight from OpenGL’s implementation, so everyone uses the same one.
vt_dimensions_pages is the size of a Tile (what Barrett and a number of other people call a “Page”, a term I find inappropriate).
vt_dimension is the size of the megatexture at the most detailed level (mipmap 0).
You’ll see below that the CPU is going to read back the pixels of the offscreen buffer. To save a lot of processing power, the scene is not rendered at full size. readback_reduction_shift is the exponent of a power of two: since it equals 2 here, the offscreen buffer is a quarter of the width and height of the final rendering. I personally set this value to 4, which makes the width and height of the buffer 1/16th of the size of my view.
I’m not sure what mip_bias is. I believe it is a way to make the shader less aggressive in its changes of mipmap levels, at the cost of the texture being a little blurry at times. (I don’t use it in my own implementation.)
The second part determines the Tile’s x and y and renders them in the color buffer:
// the tile x/y coordinates depend directly on the virtual texture coords
Note that there is a mistake here: the mipmap level must be floored! Otherwise there will be a discrepancy between the level determined here, and the one determined in the Texturing shader.
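To make the two steps of the determination shader concrete, here is a minimal CPU-side sketch in Swift. It is not Barrett’s shader: the function names, the way derivatives are passed in (in texels of the full megatexture) and the clamping are assumptions of mine. It uses the usual approximation of the mipmap level from screen-space derivatives, floors it as noted above, and then derives the tile coordinates.

```swift
import Foundation

// dx and dy are the changes of the texture coordinate, measured in texels of the
// full megatexture, between horizontally and vertically adjacent pixels
// (what dFdx and dFdy would give in a fragment shader).
func mipLevel(dx: (Double, Double), dy: (Double, Double), maxLevel: Int) -> Int {
    let d = max(dx.0 * dx.0 + dx.1 * dx.1, dy.0 * dy.0 + dy.1 * dy.1)
    // Magnified or degenerate footprint: use the most detailed level.
    guard d.isFinite, d > 1 else { return 0 }
    let level = 0.5 * log2(d)
    // Floor the level so that both shaders agree on it.
    return min(Int(level.rounded(.down)), maxLevel)
}

// Tile coordinates for texture coordinates u, v (assumed to be in 0...1) at a given level.
func tileCoordinate(u: Double, v: Double, level: Int, tilesAtLevel0: Int) -> (x: Int, y: Int) {
    let tiles = max(tilesAtLevel0 >> level, 1)
    let x = min(max(Int(u * Double(tiles)), 0), tiles - 1)
    let y = min(max(Int(v * Double(tiles)), 0), tiles - 1)
    return (x: x, y: y)
}

let level = mipLevel(dx: (3.0, 0.0), dy: (0.0, 1.0), maxLevel: 7)
let tile = tileCoordinate(u: 0.5, v: 0.25, level: 1, tilesAtLevel0: 256)
```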
A small image:
I changed the way colours are rendered so the image is visible, but the size is real. If the OpenGL view renders at 800 x 600, then the offscreen buffer is rendered at 1/16th of that, that is 50 x 37.
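The size arithmetic is a simple bit shift. A small sketch, assuming a hypothetical helper named readbackSize:

```swift
// Size of the offscreen buffer for a given view size and reduction shift.
// A shift of 4 divides each dimension by 2^4 = 16, rounding down.
func readbackSize(viewWidth: Int, viewHeight: Int, reductionShift: Int) -> (width: Int, height: Int) {
    let width = max(viewWidth >> reductionShift, 1)
    let height = max(viewHeight >> reductionShift, 1)
    return (width: width, height: height)
}

let size = readbackSize(viewWidth: 800, viewHeight: 600, reductionShift: 4)
print(size.width, size.height)
```

With these numbers the CPU only has to examine 50 x 37 = 1850 pixels instead of 480000.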
Loading tiles in the Cache
Reading back the offscreen buffer
I use glReadPixels to get the pixels. Every pixel corresponds to what I call a “Tile Id”: texture ID, x, y, mipmap level. Pixels with the “None” texture ID are discarded immediately. The others are converted to TileId objects, which are added to an NSMutableSet in order to remove duplicates: since the same tile shows up in several places, its TileId will appear several times.
It is not necessary to read the buffer, and therefore to determine the visible tiles, at every draw cycle. I only do it once every 4 frames (at 60 fps, that is 15 times per second).
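The de-duplication step can be sketched like this in Swift, using a Set where the text uses an NSMutableSet. The TileId struct and the visibleTiles function are hypothetical names, and the pixel array stands in for whatever glReadPixels returned.

```swift
struct TileId: Hashable {
    var textureId, x, y, mipLevel: UInt8
}

// Converts raw RGBA8 bytes into the set of distinct visible tiles.
// Pixels whose alpha (texture ID) is 0 mean "no texture" and are skipped.
func visibleTiles(inRGBA pixels: [UInt8]) -> Set<TileId> {
    var tiles = Set<TileId>()
    for i in stride(from: 0, to: pixels.count - 3, by: 4) {
        let textureId = pixels[i + 3]
        if textureId == 0 { continue }
        tiles.insert(TileId(textureId: textureId, x: pixels[i], y: pixels[i + 1], mipLevel: pixels[i + 2]))
    }
    return tiles
}

// Three pixels: the same tile twice, then a cleared pixel.
let readback: [UInt8] = [1, 2, 0, 1,  1, 2, 0, 1,  9, 9, 9, 0]
let visible = visibleTiles(inRGBA: readback)
```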
Determine the tiles to load
Now that we have a list of visible tiles, we can compare them to the ones already in the Cache. The difference is the set of tiles to load.
In my implementation, tiles are loaded as background tasks, but textures are uploaded to GPU memory on the main thread, because we have to with OpenGL. These textures obviously don’t have mipmaps, but do use interpolation.
While a tile is loading, you will want to replace it with a “parent” tile (one with less detail) already in the Cache. This is not too difficult, since the replacement only consists of a substitution in the Table of Indirection. Since the parent might not be in the Cache either, you should look for the grandparent, the great-grandparent, and so on. I add the “base tile” (the lowest resolution one) to the set of visible tiles, so I’m always sure that at least the Base tile is in the Cache.
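Both steps, the difference and the search for an ancestor, can be sketched as follows. The names are hypothetical, and I assume that the parent of tile (x, y) at level n is tile (x / 2, y / 2) at level n + 1, with baseLevel being the level of the base tile.

```swift
struct TileId: Hashable {
    var textureId, x, y, mipLevel: UInt8
}

// Tiles that are visible but not yet in the Cache.
func tilesToLoad(visible: Set<TileId>, cached: Set<TileId>) -> Set<TileId> {
    visible.subtracting(cached)
}

// Walks up the mipmap chain until a tile that is already cached is found.
func nearestCachedAncestor(of tile: TileId, in cached: Set<TileId>, baseLevel: UInt8) -> TileId? {
    var current = tile
    while current.mipLevel < baseLevel {
        current = TileId(textureId: current.textureId,
                         x: current.x / 2,
                         y: current.y / 2,
                         mipLevel: current.mipLevel + 1)
        if cached.contains(current) { return current }
    }
    return nil
}

let wanted = TileId(textureId: 1, x: 6, y: 4, mipLevel: 0)
let grandparent = TileId(textureId: 1, x: 1, y: 1, mipLevel: 2)
let cached: Set<TileId> = [grandparent]
let missing = tilesToLoad(visible: [wanted, grandparent], cached: cached)
let substitute = nearestCachedAncestor(of: wanted, in: cached, baseLevel: 3)
```

If the base tile is always kept in the Cache, the search never comes back empty for a valid tile.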
The Cache itself is simply a texture (not mipmapped, but interpolated) which forms a grid of tiles. Somewhere, you need a table of correspondence between a TileId and its position in the Cache. I use a dictionary, indexed by the TileId. Use glTexSubImage2D() to replace only the part of the texture which contains the new tile.
When the Cache is full, some tiles must be dropped. Like most people, I use a simple Least Recently Used mechanism to determine which ones. It’s simple, and it works. I tried other heuristics based on the mipmap level, dropping the least detailed tiles only as a last resort, but they did not work well: the most detailed tiles ended up being loaded too frequently.
Dropping a Tile consists of marking its position as free in the table of correspondence. Since this does not involve any OpenGL calls, it can be done at any time.
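Here is a minimal sketch of such a table with a Least Recently Used policy. It only does the bookkeeping: the TileCache type and its methods are assumptions of mine, and the actual upload with glTexSubImage2D() is left out.

```swift
struct TileId: Hashable {
    var textureId, x, y, mipLevel: UInt8
}

struct TileCache {
    let capacity: Int
    private(set) var slots: [TileId: Int] = [:]   // tile -> slot index in the Cache texture
    private var lastUsed: [TileId: Int] = [:]     // tile -> logical time of last use
    private var freeSlots: [Int]
    private var clock = 0

    init(capacity: Int) {
        precondition(capacity > 0, "the cache needs at least one slot")
        self.capacity = capacity
        self.freeSlots = Array((0..<capacity).reversed())
    }

    // Marks a cached tile as recently used (call it for every visible tile).
    mutating func touch(_ tile: TileId) {
        guard slots[tile] != nil else { return }
        clock += 1
        lastUsed[tile] = clock
    }

    // Returns the slot where the tile's pixels should be uploaded,
    // dropping the least recently used tile first if the cache is full.
    mutating func insert(_ tile: TileId) -> Int {
        if let slot = slots[tile] {
            touch(tile)
            return slot
        }
        if freeSlots.isEmpty, let victim = lastUsed.min(by: { $0.value < $1.value })?.key {
            drop(victim)
        }
        let slot = freeSlots.removeLast()
        slots[tile] = slot
        clock += 1
        lastUsed[tile] = clock
        return slot
    }

    // Dropping only frees the slot in the table; no OpenGL call is involved.
    mutating func drop(_ tile: TileId) {
        guard let slot = slots.removeValue(forKey: tile) else { return }
        lastUsed[tile] = nil
        freeSlots.append(slot)
    }
}

let a = TileId(textureId: 1, x: 0, y: 0, mipLevel: 0)
let b = TileId(textureId: 1, x: 1, y: 0, mipLevel: 0)
let c = TileId(textureId: 1, x: 2, y: 0, mipLevel: 0)
var cache = TileCache(capacity: 2)
let slotA = cache.insert(a)
let slotB = cache.insert(b)
cache.touch(a)
let slotC = cache.insert(c)   // the cache is full: b is the least recently used
```

For a 16 x 16 cache, a slot index converts to grid coordinates with slot % 16 and slot / 16.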
The Cache does not have to be huge: 16 x 16 tiles works. In my experience 8 x 8 tiles is not big enough on a Retina display: the program loads tiles and drops them continually. Make the Cache bigger if you want to remove some burden on the CPU, or have several Megatextures.
A 256 x 256 tile takes 256 KB of memory (at 4 bytes per pixel), so a 16 x 16 tile cache takes 64 MB. That is very reasonable.
Table of Indirection
The Texturing Shader needs to know the coordinates of a Tile in the Cache texture. For that purpose, it is given a Table of Indirection, which is a mipmapped texture.
A pixel of the texture contains the following information:
x position in the cache (stored in .r)
y position (in .g)
mipmap level (in .b).
For a particular mipmap level, this table has one pixel per tile. For instance, say that my megatexture measures 256 x 256 tiles at mipmap 0; then the table measures 256 x 256 pixels at mipmap 0. There are only 128 x 128 tiles at mipmap 1, and hence the table measures 128 x 128 pixels at mipmap 1. The correspondence is direct, so the Texturing Shader determines the tile’s x and y from the texture coordinate and does a simple lookup. (In other words, there is a table of indirection for every mipmap level, and all these tables are combined in a single mipmapped texture.)
You might wonder why the mipmap level is stored in the table, since it can be determined by the shader. Actually, this is what makes it possible to substitute a parent tile: in that case, the pixel contains the x, y, and mipmap level of the parent, not the child. The mipmap level of the parent tile is needed to correctly determine the position within the parent tile.
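The lookup can be sketched on the CPU as follows. This is my own illustration, not the shader: TableEntry, physicalCoordinate and their parameters are hypothetical names, and tile borders are ignored. It shows why the stored mipmap level matters: the position inside the tile is computed at the level of the tile that is actually in the Cache.

```swift
// What one pixel of the Table of Indirection holds.
struct TableEntry {
    var cacheX: Int     // tile column in the Cache (stored in .r)
    var cacheY: Int     // tile row in the Cache (stored in .g)
    var mipLevel: Int   // level of the tile actually stored (in .b)
}

// Converts virtual texture coordinates (0..<1) into coordinates in the Cache texture (0..<1).
func physicalCoordinate(u: Double, v: Double, entry: TableEntry,
                        tilesAtLevel0: Int, cacheTilesPerSide: Int) -> (u: Double, v: Double) {
    // Number of tiles per side at the level of the stored tile.
    let tiles = Double(max(tilesAtLevel0 >> entry.mipLevel, 1))
    // Position inside that tile, between 0 and 1.
    let insideU = (u * tiles).truncatingRemainder(dividingBy: 1)
    let insideV = (v * tiles).truncatingRemainder(dividingBy: 1)
    let side = Double(cacheTilesPerSide)
    return (u: (Double(entry.cacheX) + insideU) / side,
            v: (Double(entry.cacheY) + insideV) / side)
}

// A 4 x 4 tile megatexture whose level 1 parent tile sits at (2, 3) in a 16 x 16 Cache.
let parentEntry = TableEntry(cacheX: 2, cacheY: 3, mipLevel: 1)
let physical = physicalCoordinate(u: 0.3, v: 0.6, entry: parentEntry,
                                  tilesAtLevel0: 4, cacheTilesPerSide: 16)
```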
Finally, we arrive at the end of the chain!
uniform sampler2D pageTableTexture;
uniform sampler2D physicalTexture;
// converts the virtual texture coordinates to the physical texcoords
Older versions of OpenGL (like the one I’m constrained to use, because of Apple) did not allow sampling a texture at a particular mipmap level. However, texture2D() may take a third parameter, which is a bias added to the implicit mipmap level (the one computed by OpenGL). I don’t know why 0.5 is subtracted, but it works better this way.
I must say that I had a lot of problems with this principle because it assumes that:
Tiles are square
The megatexture is square
Since I had to cover the Earth, my megatexture was not square, but had a 2:1 ratio instead, and my tiles were 512 x 256 pixels. If yours are not square either, you will run into trouble: the computation of the mipmap level is right vertically but not horizontally, and you will get visual artifacts, since the wrong mipmap level is sampled from the indirection texture. So, don’t do that: make your tiles square and stretch your megatexture if needed. It will save you a lot of pain.
(With a more recent version of GLSL, you might use texture2DLod(), and compute the mipmap level like in the Determination Shader, and not have this problem).
We’re not done yet! Remember that the megatexture is a huge image that must be cut into tiles. I personally wrote a Python program that uses ImageMagick to cut the tiles and resize them. I won’t go into details here, but you should know that ImageMagick is slow and not very user-friendly (and that, by default, rescaling preserves the aspect ratio). You may do it some other way, maybe with a Photoshop script or whatever.
There is one final problem with the principle of the megatexture itself. Because tiles are stored in the Cache texture in an arbitrary order, a tile is unrelated to its neighbours. This causes visual problems because of the linear interpolation at tile edges, which makes half a pixel of the neighbouring tiles show:
The solution is well known: leave a margin of 1 pixel on each side, and sample at this size. Hence the actual usable size is 254 x 254 pixels on my 256 x 256 tiles.
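In terms of arithmetic, the margin simply shrinks the range of texels that a position inside the tile can map to. A small sketch, with a hypothetical helper name:

```swift
// Maps a position inside a tile (0...1 along one axis) to a texel coordinate
// that stays inside the usable area, leaving `border` texels free on each side.
func texelInsideTile(fraction: Double, tileSize: Int, border: Int) -> Double {
    let usable = Double(tileSize - 2 * border)   // 254 for a 256 texel tile with a 1 texel border
    return Double(border) + fraction * usable
}

let first = texelInsideTile(fraction: 0.0, tileSize: 256, border: 1)
let middle = texelInsideTile(fraction: 0.5, tileSize: 256, border: 1)
let last = texelInsideTile(fraction: 1.0, tileSize: 256, border: 1)
```

Sampling therefore never gets closer than one texel to the edge of the tile, so interpolation cannot reach the neighbouring tile in the Cache.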
I could not have made my Megatexture work without the following resources:
This was my main source of inspiration, because it is very concise, covers most issues and guides you toward a practical implementation. I don’t use his approach for the Indirection Tables though, which I find awkward. Maybe he could not do otherwise in WebGL.
The example of Sean Barrett
There is a video, but I found it rather difficult to follow. It does not explain the basic technique well, but it might be interesting if you want to handle tri-linear filtering (which I don’t). You might also like to take a look at the source code, since most shaders written by other folks are based on it.
Thesis by Albert Julian Mayer
This is really interesting as it sums up a lot of the techniques that are known for virtual texturing. You should definitely take a look if there are details you did not understand in my post, or if you want to push the technique further.
I decided to take a month off from client projects, so I could work on subjects which I don’t usually have the time to work on.
Since I had recently begun learning Haskell, I knew that I had to code a real project. Actually, I think it’s the only way to study programming seriously: stick to a problem and find ways to attack it. A Raytracer looked like a reasonable idea for a project, since Raytracers use recursion, which is the specialty of functional languages like Haskell.
Anyway, to sum up: I began with Haskell, and I ended up with Swift. I ran into some difficulties with pseudo-random numbers in Haskell, I was tired, and I was not sure what was wrong (this was my first raytracer). I have not really given up; I just moved on to keep my motivation.
The original source code was written in C++. I was asked whether porting it to Swift had been hard: not really. Sometimes the C++ code was difficult to follow because of the way it is written, but Swift is way more elegant than C++ and has all the required features, in particular operator overloading, which is more than useful when working with vectors.
The second question was how it performs compared to C++. I don’t know, since I have not measured. I don’t really care, and that was not a goal of my experiment. All I can say is that it’s about the speed I expected: very slow. Shirley uses a “Pathtracer” algorithm, and Wikipedia says that slowness is characteristic of these raytracers. It already takes hours to render with 100 rays per pixel (112 minutes on my Mac for the very small and simple image above), and the image is still very noisy. At least 500 rays would be needed!
Currently the program uses a single thread. The obvious next step is therefore to parallelize the work so I can use all 4 cores of my Mac. Since I mostly used Structs (≃ immutable objects), it should be easy. I’ll let you know when I find the time…
I’ve been reading Yegor Bugayenko’s blog (yegor256.com) for a year or so. At the time I discovered it, I was struggling with Core Data, and I could not really explain why until I stumbled upon one of his blog posts, titled something along the lines of “Why ORMs are evil”. This post made explicit what I felt was wrong with ORMs but could not articulate, and also provided an alternative.
At the time, I was studying a little Functional Programming and was beginning to think that what was wrong might be Object-Oriented Programming in the first place. And then Yegor’s blog opened my eyes, and I discovered I had been doing it wrong for ages. Not that my code was terrible; it was actually very close to the standards of our industry, which means not so good.
His blog confirmed I was heading in the right direction on some things: for example, my latest code was written so that my objects were immutable, which proved to make them easier to design and test, without any inconvenience in practice. It also made me reconsider my use of abstract classes, in favour of small protocols (you would say “interfaces” in Java).
A manifesto for Object Thinking
The book is a kind of collection of the most emblematic blog posts he had written. However, it is certainly not copy-and-paste. The book is well organised into four parts — Birth, Education, Employment, Retirement — carefully chosen to emphasize the anthropomorphic nature of Objects. The chapters and paragraphs themselves were rewritten to make the whole book consistent.
Yegor Bugayenko thinks our industry has got OOP all wrong. People on his blog frequently call him an “OOP extremist”, which he would take as a compliment! As such, the book is very divisive, with frequent words like “evil”, “all wrong”, “you must”, “I think”. It is very opinionated, which is its greatest quality: a book is meant to present things another way; otherwise you would not learn anything.
What distinguishes his discourse from trolling is that each point is argued. The author tries to convince with examples, showing how they are wrong and how they could be made better. Most examples are great, a few are awkward, but in any case, they have the merit of making the reader think.
I would recommend the book to any seasoned OOP programmer, although it is not perfect. In its current state, it looks a lot like a manifesto: it strongly states what the author is against, but not enough about what can be done instead. I wish the author had better explained the alternatives that he uses, like the Decorator design pattern, or how he passes dependencies around the application, for example when they are shared resources.
But maybe this first edition had to look like a manifesto, because this thinking is so radical. I hope the second edition will be less defensive and will provide more practical examples.
I regularly receive requests to develop iOS applications in exchange for a share of potential profits. The latest request was far more serious than usual, so I decided to answer in detail.
In my answer, I explain why I have turned down every request made to me so far.
I am currently engaged in a long-term contract, so I cannot accept your request.
To be completely honest, I would probably have declined anyway. Let me explain why:
Over the last four years, I have been approached many times to do development work, on projects that were more or less serious.
One of these requests came from a friend of a friend, whom I had already met. At the time, I was just starting out as a freelancer and had no contracts, so I had time on my hands and needed to build my first projects. The first discussion went very well: the project seemed a little too ambitious, but profitable on the face of it. Something in the hotel business. It was a kind of white-label iPhone app that we would customise for each hotel.
Obviously, they had no money, so I was to work in exchange for a share of the revenue from sales of the app. Everybody seemed happy, so I got to work; we would settle the contractual questions within the week.
The first cold shower came when I received the first contract terms: not only was I being offered little (because they had a lot of travel expenses to acquire customers), but they even wanted me to hand over the intellectual property of my work.
The second cold shower was that, over the course of the week, they started to ask for feature after feature, which in practice meant that I had to work longer. In other words, I was also investing the income I would have earned by working for other clients.
In the end, I put my foot down: I accepted the proposed revenue split, but I clearly defined what the iPhone app would do. The discussions stopped there. I have never regretted it.
That day, I understood one thing: this is not a healthy business relationship. When I do contract work, the client and I define a scope and a price. If it is too expensive, we can reduce the scope.
But in the situation described above, the client and I have diverging interests: mine is to work for as short a time as possible so as to invest as little as possible, and his is to get as many features as possible so that his offering is as attractive as possible.
Besides, I am not an investor. I do not have enough financial reserves to work for months for a potential gain. All the more so because, before investing, one must assess the potential of the project, and above all the team.
Because in practice, what I have often observed is that the project owner brings nothing but an idea, some vague experience and a vague professional network. It makes me wonder what I would gain from teaming up with such a partner: I do the work, and we have to share the profits.
(I should point out that the previous sentence does not necessarily apply to you. In fact, I know nothing about your situation; you probably have more than an idea, all the more so since your project seems to be fairly advanced already.)
In short, I hope you will not hold this against me. I have taken the time to write to you, in all honesty, what I really think; I hope this honesty will help you understand why it will be difficult for you to find a qualified iOS developer for your project. And, why not, perhaps also help you convince a developer who might be ready to embark on the adventure.
I have run into two serious limitations when using Scene Kit’s -[SCNSceneRenderer hitTest:options:] method:
it is not sufficient to add a node to the scene for the hit test to find it. The scene must also have been rendered once.
the method takes a long time. In my example, about 20 ms in a very simple scene which only contains a couple of spheres. Unfortunately, I had to call it 2000 times, so it takes 40 s! Totally unusable in my case.
The method seems to have been designed for user interaction, and is only suitable for that case.
(I finally solved the problem by coding my own hit testing, which was possible because I work with a simple sphere. It was not easy because of the lack of information on what the matrix of the SCNCamera really contains, but I eventually managed to reduce the time from 40 s to a couple of milliseconds).
Although keeping a child File Wrapper as a property does not look stupid at first, you will run into problems when doing so. The reason is that the parent wrapper might replace its instances of child wrappers with other instances if it sees fit.
The solution is to always consult the .fileWrappers property of the parent Wrapper and enumerate child wrappers from it.
There is no method to update a FileWrapper. If the content changes, remove the wrapper from its parent and add a new one.
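In Swift, the two pieces of advice above (look the child up through the parent, and replace rather than update) could be combined as in this sketch. The helper name setContents is an assumption of mine; the FileWrapper calls are the standard Foundation ones.

```swift
import Foundation

// Replaces (or creates) the regular-file child called `name` in a directory wrapper.
func setContents(_ data: Data, ofChildNamed name: String, in parent: FileWrapper) {
    // Always look the child up through the parent, never through a stored reference.
    if let old = parent.fileWrappers?[name] {
        parent.removeFileWrapper(old)
    }
    let child = FileWrapper(regularFileWithContents: data)
    child.preferredFilename = name
    _ = parent.addFileWrapper(child)
}

let root = FileWrapper(directoryWithFileWrappers: [:])
setContents(Data("v1".utf8), ofChildNamed: "MyFile", in: root)
setContents(Data("v2".utf8), ofChildNamed: "MyFile", in: root)
```

Removing the old wrapper first means the new one can keep its preferred name instead of receiving a generated one.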
Files named 1_#%$#%$_MyFile are created
This mechanism is used to ensure that file names are unique. Therefore, if a file named “MyFile” already exists in the directory, and a child File Wrapper is added with its preferredFilename also set to “MyFile”, the second file will be named “1_#%$#%$_MyFile”. Hence the name of the “preferredFilename” property.
In general, this happens because you’ve messed up the child wrappers: for example, when updating (see above), you forgot to remove the old wrapper before adding the new one with the same name.
“File already exists” error when calling writeToURL
I’ve run into this during unit testing.
Say you have a file package on disk, and you want to update it. Calling writeToURL:options:originalContentsURL:error: will fail with a “File already exists” error. Pass the NSFileWrapperWritingAtomic option and it will work.
I think the reason is that, by default, Apple engineers wanted to ensure that the file would remain untouched if saving failed. It looks like NSDocument ensures this by itself, so it does not even set this option.
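In Swift, the same call looks like the sketch below, where NSFileWrapperWritingAtomic is spelled .atomic. The save function and the temporary package URL are assumptions made for the example.

```swift
import Foundation

// Writes the package to `url`, replacing an existing package at that location.
func save(_ package: FileWrapper, to url: URL) throws {
    try package.write(to: url, options: .atomic, originalContentsURL: nil)
}

// Build a small package with a single file in it.
let document = FileWrapper(directoryWithFileWrappers: [:])
let entry = FileWrapper(regularFileWithContents: Data("hello".utf8))
entry.preferredFilename = "MyFile"
_ = document.addFileWrapper(entry)

let packageURL = FileManager.default.temporaryDirectory
    .appendingPathComponent(UUID().uuidString)
```

Calling save(document, to: packageURL) a second time is the situation described above: the package already exists on disk when the write happens.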
zxing dropped support for the iOS platform recently. This leaves us with Zbar.
Zbar pre-compiled library does not support modern architectures
However, there is a problem with the Zbar library found in the SDK: it’s been compiled for the armv6, armv7 and i386 architectures only. Therefore, if your application supports more modern architectures like armv7s or arm64, it won’t link with the precompiled library.
Compiling from the source code
The solution is to compile Zbar yourself from the source code, which you can clone using Mercurial. You will find explanations for Xcode 4 here, and for Xcode 5 here.
However, there is an easier way. Someone had the great idea of cloning the original Mercurial repository to GitHub and also created a Podspec, so Zbar may be included in your project using CocoaPods.
A client wanted his app to show a list of contacts. He wanted all contacts whose names began with the same letter to be in the same section of a table view. However, he was not satisfied with the first version I coded, because empty sections would show. For instance, if there was no contact whose name began with a B, there would be an empty B section.
So, I went back to my code, and I came up with this result:
My code only handles letters from A to Z, with no diacritics. I was not aware of the UILocalizedIndexedCollation class at the time I programmed the first version, and when I discovered it, it seemed to present an interesting feature: it can handle other locales, in particular, Arabic or Asian alphabets.
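The idea of only creating sections that actually contain contacts can be sketched in Swift as follows. This is not the code I wrote for the client: the function name is hypothetical, and putting names that do not start with A to Z into a “#” section is an assumption of this sketch.

```swift
// Groups names by their first letter (A to Z, case-insensitive, no diacritics handling).
// Only letters that actually occur produce a section, so there are no empty sections.
func sections(for names: [String]) -> [(letter: String, names: [String])] {
    let letters: ClosedRange<Character> = "A"..."Z"
    let grouped = Dictionary(grouping: names) { (name: String) -> String in
        guard let first = name.uppercased().first, letters.contains(first) else {
            return "#"   // assumption: everything else goes into one "#" section
        }
        return String(first)
    }
    return grouped.keys.sorted().map { letter in
        (letter: letter, names: grouped[letter, default: []].sorted())
    }
}

let contactSections = sections(for: ["Alice", "Carol", "adam", "Chris"])
let sectionTitles = contactSections.map { $0.letter }
```

The section titles can be used both for the table view headers and for the index shown on the right side.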
For the second version, I thought it would be better to use it, but I really wasted two hours:
UILocalizedIndexedCollation uses a fixed list of indexes, so empty sections are shown, just like in the first version I came up with. I thought Apple would have produced a much better implementation.