mightymandel v16

GPU-based Mandelbrot set explorer

TODO

Easy

  • make interactive responsiveness vs efficiency settable.
    • new command line flag that sets timeout to render_calculate in the main loop
  • --batch mode (list of parameter files on stdin or a file).
    • would work like --one-shot for each file, but should be quicker (less overhead).
    • tricky bit: needs to avoid calling glfwSetWindowShouldClose() in some places in main()
  • change weight keys to use a few adjusters (weight up, weight down, reset)
    • with small/big options like some of the other things using shift and ctrl.
    • but most keys are tricky with shift, depends on keyboard layout - maybe just use ctrl
  • make approx/no-approx work at runtime.
    • add a keyboard shortcut (perhaps 'a') that toggles the flag.
    • it should output a message with the new status
    • it should set the updated flag so that rendering is restarted
  • --zoom-start N to start from further down a zoom, default 0.
    • need to adjust --zoom count implementation to render the correct number of frames
  • --zoom-step N to advance by N frames each time, default 1.
    • could be useful for splitting a render job over multiple machines.
    • need to think about --zoom count behaviour
  • --tile-start MxN to start from a particular tile, default 0x0.
    • could be useful if rendering was interrupted or a tile was rendered badly
  • --zoom-factor F (default 0.5) to set the ratio between successive frame radii when zooming.
    • might be used to render all video frames at maximum quality for perfectionists.
    • the extra/zoom assembler interpolation adds undesirable artifacts (time-variant blurring)
  • --print-zoom-count option to show the number of frames needed to reach the final view.
    • should respect --zoom-factor, not sure about --zoom-start and --zoom-step
    • calculate like: -log(256/finalRadius)/log(zoomFactor)
  • NOTICE interactive key control usage when F1 is pressed (help key)
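
The --print-zoom-count calculation above can be sketched as follows. This is a minimal sketch, not mightymandel code: zoom_frame_count is a hypothetical name, the starting radius of 256 is taken from the formula above, and rounding up to a whole frame is an assumption. It counts frames by repeated multiplication, which should agree with -log(256/finalRadius)/log(zoomFactor) rounded up, and it uses double precision only, so it does not cover very deep zooms.

```c
/* Hypothetical helper: number of zoom frames needed to shrink the view
 * radius from start_radius to final_radius when each frame multiplies
 * the radius by zoom_factor (0 < zoom_factor < 1).
 * Returns -1 on invalid input or if the count exceeds a safety cap. */
int zoom_frame_count(double start_radius, double final_radius, double zoom_factor)
{
  const int max_frames = 1000000; /* assumed safety cap, not from the TODO */
  if (!(start_radius > 0.0) || !(final_radius > 0.0)) return -1;
  if (!(zoom_factor > 0.0) || !(zoom_factor < 1.0)) return -1;
  int frames = 0;
  double radius = start_radius;
  /* equivalent to ceil(-log(start_radius / final_radius) / log(zoom_factor)) */
  while (radius > final_radius) {
    if (frames >= max_frames) return -1;
    radius *= zoom_factor;
    ++frames;
  }
  return frames;
}
```

With the default factor, zoom_frame_count(256.0, 1.0, 0.5) counts eight halvings. How --zoom-start and --zoom-step should affect the result is left open, as in the item above.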

Tricky

  • --fp32-iters N, --fp64-iters N, --fpxx-iters N to set FP??_STEP_ITERS at startup instead of hardcoding in config.glsl.
    • round to power of two, WARN if not equal to what was requested
    • set an FPXX_MAX_STEP_ITERS for use in struct fpxx_step array
    • ERROR and clamp if --fpxx-iters rounded power of two is too high
    • use global variables to store the values
    • use snprintf() with a static buffer, add the DE define there too
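
A minimal sketch of the rounding and clamping steps for the --fp??-iters flags above. The function names and the limit of 4096 are assumptions for illustration, and "round to power of two" is interpreted here as rounding to the nearest power of two with ties going up; the TODO does not say which direction is intended.

```c
#include <stdio.h>

/* Assumed value for illustration; the TODO only names the constant. */
#define FPXX_MAX_STEP_ITERS 4096

/* Round n to the nearest power of two (ties round up), minimum 1. */
int round_to_power_of_two(int n)
{
  if (n < 1) return 1;
  if (n > (1 << 30)) return 1 << 30; /* avoid overflow for huge requests */
  int lo = 1;
  while (lo <= n / 2) lo *= 2;       /* largest power of two <= n */
  if (lo == n) return n;
  int hi = 2 * lo;                   /* smallest power of two > n */
  return (n - lo < hi - n) ? lo : hi;
}

/* Hypothetical flag handler: WARN if the value was rounded,
 * ERROR and clamp if the rounded value exceeds max_iters. */
int step_iters_from_flag(int requested, int max_iters)
{
  int rounded = round_to_power_of_two(requested);
  if (rounded != requested) {
    fprintf(stderr, "WARN: step iters %d rounded to %d\n", requested, rounded);
  }
  if (rounded > max_iters) {
    fprintf(stderr, "ERROR: step iters %d too high, clamping to %d\n", rounded, max_iters);
    rounded = max_iters;
  }
  return rounded;
}
```

The returned value would then be stored in a global and written into the shader defines with snprintf(), as the item suggests.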
  • configuration file in ~/.mightymandel/config
    • use command line flag syntax
    • allow comments on lines beginning with #
    • needs a parser with support for "quoting" and 'quoting' to create new argc/argv
    • should it expand ~/ and environment variables?
    • command line flag to create the directory and write a default configuration with comments
    • WARN if no config found and suggest the flag to create one
    • command line parameters should take precedence, but
    • there should be a command line flag to ignore the configuration file without warning
  • compute reference orbit iterations in a background thread to improve throughput
  • make extra/zoom auto-detect frame size
  • make end key (with ctrl) jump to a view of the central minibrot if there is one
  • make end key (with shift) jump to a view of its 2-fold embedded Julia set
    • use partials of central minibrot to get the outer minibrot period
      • try partials from penultimate down
      • find the first partial whose muatom is cardioid (use shape estimate)
    • period of the atoms in the 2-fold embedded Julia set is inner+outer period
    • use central minibrot atom domain size estimate to get a radius
    • try a few points around the atom domain to get the embedded Julia muatom
    • set the view radius so that both inner and embedded Julia atoms are visible
  • make home key (with ctrl) jump to a view of the outer minibrot
    • use partials of central minibrot to get the outer minibrot period
      • try partials from penultimate down
      • find the first partial whose muatom is cardioid (use shape estimate)
  • make home key (with ctrl+shift) jump back to the initial view
  • add command line option for setting output file names
    • plain filename for --one-shot
    • with one %04d (or similar) filled by sequence number for --interactive
    • with two %04d (or similar) filled by tile x y for --tile
    • with one %04d (or similar) filled by frame number for --zoom
    • with three %04d (or similar) filled by frame number and tile x y for --tile --zoom
    • with one %s (or similar) filled by basename of input file for --batch
      • maybe add S for basename without extension
    • numbers should start from 0
    • security risk
      • snprintf shouldn't be used with the file pattern directly
      • instead the file pattern should be parsed
      • if any unrecognised % options occur, fail with an error
      • if any unrecognised \ options occur, fail with an error
      • if the number and type of patterns is different from the expected count and type of arguments, fail with an error
      • only if all these tests pass, use the file pattern with the correct arguments
    • write a bool snprintf_check(const char *pattern, const char *types) function
    • types would be a string like "sddd" for --batch --zoom --tile
    • it would return true if all the checks pass
    • the pattern string should be copied from argv first, just in case some other code modifies it (global mutable state is evil)
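
A possible shape for the snprintf_check() function described above, as a hedged sketch. The signature comes from the item; everything else is an assumption: only %d and %s conversions (with an optional digit-only flag/width such as 04) and a literal %% are accepted, and every backslash is rejected because this sketch recognises no backslash options.

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true only if pattern contains exactly the conversions listed
 * in types, in order (for example "sddd" for --batch --zoom --tile). */
bool snprintf_check(const char *pattern, const char *types)
{
  if (!pattern || !types) return false;
  const char *t = types;
  for (const char *p = pattern; *p; ++p) {
    if (*p == '\\') return false;  /* assumption: no backslash options allowed */
    if (*p != '%') continue;       /* ordinary character */
    ++p;
    if (*p == '%') continue;       /* "%%" is a literal percent sign */
    /* optional zero flag and width digits, as in %04d */
    while (isdigit((unsigned char) *p)) ++p;
    /* anything but d or s is unrecognised (this also catches a trailing '%') */
    if (*p != 'd' && *p != 's') return false;
    /* wrong type, or more conversions than expected arguments */
    if (*p != *t) return false;
    ++t;
  }
  return *t == '\0';               /* false if fewer conversions than arguments */
}
```

For example, snprintf_check("%s_%04d_%04d_%04d.ppm", "sddd") passes, while a pattern containing %n or one with too few conversions is refused before it ever reaches snprintf().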
  • find all the places where bits are computed to determine render method
    • make all of them use a render_method_t that gets computed once and stored somewhere appropriate
  • add a --debug WHICH option that allows debug logs to be limited to certain aspects
    • allow multiple debug flags to be specified
    • perhaps use a bitfield to store the set of debug flags
    • add a log_debug function to take the extra aspect argument
    • find and replace all the log_message(LOG_DEBUG, ...) calls with log_debug(DEBUG_FOO, ...), with FOO relevant to what is being debugged
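
The bitfield idea above could look roughly like this. It is a sketch under assumptions: the aspect names (gl, reference, slice), debug_enable() and the global debug_flags are hypothetical, and log_debug() writes straight to stderr here as a stand-in for the existing logging code.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical debug aspects, one bit each. */
#define DEBUG_GL        (1u << 0)
#define DEBUG_REFERENCE (1u << 1)
#define DEBUG_SLICE     (1u << 2)

/* Set of enabled aspects; empty by default. */
unsigned int debug_flags = 0;

/* Enable one aspect by name, as given to --debug.
 * Returns false for an unknown name so the caller can report an error. */
bool debug_enable(const char *name)
{
  static const struct { const char *name; unsigned int flag; } aspects[] = {
    { "gl", DEBUG_GL },
    { "reference", DEBUG_REFERENCE },
    { "slice", DEBUG_SLICE },
  };
  for (size_t i = 0; i < sizeof(aspects) / sizeof(aspects[0]); ++i) {
    if (strcmp(name, aspects[i].name) == 0) {
      debug_flags |= aspects[i].flag;
      return true;
    }
  }
  return false;
}

/* Print the message only if the aspect is enabled.
 * Returns the number of characters written for the message, or 0 if skipped. */
int log_debug(unsigned int aspect, const char *fmt, ...)
{
  if (!(debug_flags & aspect)) return 0;
  va_list ap;
  va_start(ap, fmt);
  fputs("DEBUG: ", stderr);
  int n = vfprintf(stderr, fmt, ap);
  va_end(ap);
  return n;
}
```

Specifying --debug several times would simply call debug_enable() once per flag, OR-ing the bits together.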
  • allow --tile to be used with --batch
  • native fractint parameter file support (for first parameter in file only)
    • requires porting the tr and sed stuff in extra/split2ppar.sh to C
    • and modify the script (call it split2par.sh) to not preprocess, just split
    • see fractint 20.4 DOS version source http://www.nahee.com/spanky/pub/fractals/programs/ibmpc/frasr200.zip
    • but don't copy the source code: "The source code for a modified version of Fractint may not be distributed."
  • optimize memory usage by reallocating vbo for fp32/fp64/fpxx (and de/no-de?)
    • currently memory is allocated for fpxx even though fp32 only needs a small fraction of it
    • would enable larger images to be rendered at lower zooms
    • risks out of video memory errors at runtime (after startup)
    • risks poor performance if driver uses system memory as vram swap
  • perhaps compute a small preview image first when tiling
    • could be useful for checking if a whole tile would be interior
  • visualisation of used reference points
    • small circles or crosses with the current one highlighted
    • needs command line flags to set the default state
    • and an interactive keyboard command (perhaps 'r') to toggle showing them
    • needs a function to transform from complex plane coordinates to pixel coordinates
    • needs to draw a number of quads with one highlighted
    • generating the circle or cross in a fragment shader seems easiest
  • persistent state / preferences for interactive mode
    • store parameter file, plus also whether de calculation is enabled
  • adjust version string generation in the build system
    • only re-generate version.c in git checkouts
    • add a make sdist that generates version.c
    • this should make compiling from a tarball work as expected
  • save raw iteration data to allow external colouring or other manipulation
    • save it interleaved for streaming processing
    • or in separate files for each plane
    • save it in a way that it could possibly later be reloaded without much difficulty
    • needs a header with settings used to render
  • automatic DE weight selection
    • binary search for a weight that makes the average of the image a certain grey level
    • do it when rendering is all done
    • ignore interior pixels
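
The binary search for a DE weight could be sketched as below. All names are hypothetical. The sketch assumes the average grey level is a monotonic function of the weight over the search interval, and that a callback computes that average over non-interior pixels of the finished image. toy_grey is a made-up stand-in used only to exercise the search, not a model of the real colouring.

```c
/* Callback: average grey level of the finished image (interior pixels
 * ignored) when coloured with the given weight. */
typedef double (*grey_fn)(double weight, void *ctx);

/* Bisect [lo, hi] for the weight whose average grey is closest to target.
 * Works for increasing or decreasing grey_fn, as long as it is monotonic. */
double find_de_weight(grey_fn average_grey, void *ctx,
                      double lo, double hi, double target, int steps)
{
  int increasing = average_grey(hi, ctx) >= average_grey(lo, ctx);
  for (int i = 0; i < steps; ++i) {
    double mid = 0.5 * (lo + hi);
    double g = average_grey(mid, ctx);
    /* move the bound on the side where the grey level is still wrong */
    if ((g < target) == increasing) lo = mid; else hi = mid;
  }
  return 0.5 * (lo + hi);
}

/* Toy stand-in for testing only: grey rises from 0 towards 1 with weight. */
double toy_grey(double weight, void *ctx)
{
  (void) ctx;
  return weight / (1.0 + weight);
}
```

Each step of the real search would need a recolouring pass and a reduction over the image, so a small fixed number of steps is probably preferable to iterating to full precision.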
  • avoid recomputing large interior regions
    • if found a central minibrot and central minibrot is visible (radius > threshold based on view radius)
    • then the large interior regions are very likely to be really interior to that minibrot and its islands

Hard

  • generate preview texture from previous view and colour uncalculated pixels using it
    • requires saving the rgb framebuffer to a texture
    • requires computing an affine transformation to transform from pixel coordinates in the new image to pixel coordinates in the old image
    • then fp32_colour would need to look up uncalculated pixels in the texture
    • could also be useful for checking if a whole frame would be interior
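
The affine transformation mentioned above, sketched for the simplest case. Assumptions: no rotation, both views share the same image size, the view radius is half the image height in complex plane units, and pixel y increases with the imaginary part (flip the sign of the centre term in ty otherwise). The struct and function names are hypothetical, and double precision limits this to shallow zooms.

```c
/* A view of the plane: centre and radius. */
struct view { double cx, cy, radius; };

/* Uniform scale plus translation: old = scale * new + (tx, ty). */
struct affine { double scale, tx, ty; };

/* Transform from pixel coordinates in the new image to pixel
 * coordinates in the old image. */
struct affine new_to_old_pixels(struct view old_view, struct view new_view,
                                int width, int height)
{
  struct affine a;
  double old_pixel_size = 2.0 * old_view.radius / height;
  /* one new pixel covers this many old pixels */
  a.scale = new_view.radius / old_view.radius;
  /* image centres line up when the view centres are equal */
  a.tx = 0.5 * width  * (1.0 - a.scale)
       + (new_view.cx - old_view.cx) / old_pixel_size;
  a.ty = 0.5 * height * (1.0 - a.scale)
       + (new_view.cy - old_view.cy) / old_pixel_size;
  return a;
}
```

For a 512x512 image zoomed in by a factor of two about the same centre, this gives scale 0.5 and an offset of 128 pixels, so the new centre pixel 256 maps back to old pixel 0.5 * 256 + 128 = 256. New pixels that map outside the old image have no preview data.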
  • re-use references
    • give each reference a score for how many pixels it de-glitched
    • re-use the best references for other tiles / zoom / interactive frames
    • picking higher scoring references first, and updating their scores
    • when to garbage-collect?
    • whether storing the reference iterations would be worth it
      • possibly yes, at least for tiling
      • reference orbits already saved for multiple slice rendering
  • make de/no-de work at runtime
    • on startup, compile shaders twice, once with de and once with no-de
    • add a key to switch modes (requires recalculating image)
    • keep track of whether image was calculated with de or no-de
    • adjust fp32_colour_frag.glsl to colour with no-de even if calc'd with de
    • add a key to switch colouring modes (does nothing if image was calc'd with no-de)
  • restore render settings when loading
    • requires changing the parser convention to return more data
    • probably best to have struct file_options { T option; bool option_set; ... }
    • requires changing all the parsers to support the new interface
    • requires adding extra parsing to parsers for formats that have some options
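
One way the struct file_options { T option; bool option_set; ... } convention above might look. The fields (de, weight, max_iters) and the merge helper are hypothetical examples, not the actual set of render settings.

```c
#include <stdbool.h>

/* Each option is paired with a flag saying whether the parser found it. */
struct file_options {
  bool   de;         bool de_set;
  double weight;     bool weight_set;
  int    max_iters;  bool max_iters_set;
};

/* Overlay: every option set in over replaces the one in base;
 * options that over leaves unset keep their value from base. */
struct file_options file_options_merge(struct file_options base,
                                       struct file_options over)
{
  if (over.de_set)        { base.de = over.de;               base.de_set = true; }
  if (over.weight_set)    { base.weight = over.weight;       base.weight_set = true; }
  if (over.max_iters_set) { base.max_iters = over.max_iters; base.max_iters_set = true; }
  return base;
}
```

A parser for a format that carries no render settings would just return a struct with every _set flag false, which leaves the current settings untouched after merging.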
  • save series approximation coefficients with references
    • these are the 6 complex values needed to initialize the iterations
    • plus the number of iterations that have been skipped (and radius?)
  • use OpenGL debug output extension (core from 4.3) to replace ;D;
    • need to check and load extension for OpenGL 3.3 and 4.1
    • D would still need to check if the extension is activated and if so do nothing
    • need to gather data of how widespread the extension support is (drivers, GPUs)
    • only if it is almost everywhere can D be removed
  • efficient translation (calculate only borders)
  • investigate multiple stream transform feedback for one-pass processing
    • initial research suggests it won't help:
    • needs a vbo to capture output from stream 0 even if we don't want to keep it
  • autotoolize mightymandel to hopefully make all the libglfw3.a issues disappear
  • doxygenate the whole source code base
  • use more accurate method for checking series approximation skip count
  • give --slice S finer granularity (2^S instead of 4^S) by alternating between horizontal and vertical subdivision
    • fillc needs to cope with non-square slice aspect
    • needs two mipmapped slice table textures, one square for even S, one rectangular for odd S
    • colour needs to look up in the right texture for a particular slice count
  • compute more than one slice at once if video memory allows
    • only applicable to fpxx rendering (all slices are full for fp32 and fp64)
    • keep track of memory available in VBO
    • after fpxx_init(), see if there is enough space in the VBO for another slice of around the same number of iterates
    • keep appending slices until it overflows, then backtrack one slice
    • is overflow safe?
    • alternatively, before starting, download the glitch image and use a CPU memory copy of the slice tables to count how many slices will actually fit first
  • parallelize reference point calculations using OpenMP (possibly will work best for series approximation with a large number of terms).
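    bool accurate = true;
    // one thread per term (needs <omp.h>); a barrier is not allowed inside
    // an "omp parallel for" loop, so use a plain parallel region instead
    #pragma omp parallel num_threads(term_id_max) shared(accurate)
    {
      int term_id = omp_get_thread_num();
      for (int i = 0; i < iterations && accurate; ++i) {
        switch (term_id) {
          case term_z: z_new = z * z + c; break;
          ...
        }
        #pragma omp barrier
        #pragma omp single
        {
          accurate = ...;
        } // implicit barrier: every thread sees the updated flag
        if (accurate) {
          switch (term_id) {
            case term_z: z = z_new; break;
            ...
          }
        }
        #pragma omp barrier
      }
    }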

Roadmap

  • port the computation part from OpenGL to OpenCL
    • much more suited to the computational tasks
    • supports both CPU/GPU backend so machines with lower spec'd GPU could use it
  • make libmightymandel so that the renderer can be used by other programs